欢迎来到cool的博客
7

Music box

Click to Start

点击头像播放音乐
新博客链接

SQL中left join、right join、inner join的区别 | Rails joins & includes & eager_load

SQL中left join、right join、inner join的区别

三个join的含义:

  • left join(左联接):返回左表中的所有记录以及和右表中的联接字段相等的记录。
  • right join(右联接):返回右表中的所有记录以及和左表中的联接字段相等的记录。
  • inner join(等值联接):只返回两个表中联接字段相等的记录。

下面将以atable、btable为例进行讲解。
atable:

tablea.png

btable:

tableB.png

left join
select * from atable as a
left join btable as b
on a.name = b.name;

结果如下:

 

leftjoin.png


可以看到,tableA join tableB的结果中,既有tableA中的所有记录,同时还包括了与右表中联接字段相等的记录,所以返回的记录总数一定是大于或等于tableA的记录总数。

 

right join的就不赘述了。

inner join
select * from atable as a
inner join btable as b
on a.name = b.name;

结果如下:

innerjoin.png

可以看到,tableA inner join tableB的结果中,只包含了tableA与tableB联接字段相等的记录。可以看作是tableA和tableB的交集。

 

Rails joins & includes & eager_load

includes和joins的不同 
当includes和joins的时候最重要的概念就是他们有他们的典型用例。 
includes使用贪婪加载(eager loading)而joins使用懒加载(lazy loading),两者都非常有用,但是也都很容易被滥用导致程序性能降低或过度使用。 
如果我们看一眼ruby on rails文档,描述includes最重要的一句话是:

使用includes,Active Record确保使用最少的查询加载所有指定的连接。

换句话说,当我们的查询需要一个关联表时,两者(我们要查的表和关联表)都会被加载到内存中,这反过来在检索关联数据时会减少查询的次数。下边例子中我们检索companies表中所有关联person表的记录。

@companies = Company.includes(:persons).where(:persons => { active: true } ).all

@companies.each do |company|
     company.person.name
end

当我们迭代每个companies表中记录并且显示其对应person名字时,通常情况下,我们不得不每次使用单独的查询来检索person的名字。然而当我们使用includes方法时,渴望加载会把这关联的两个表加载到内存,所以这些代码只需要以此查询。很棒,对吧? 
所以当我们想检索所有关联person的companies数据,但是不想显示Person表中数据的时候,使用includes会发生什么呢?这时候加载关联的表就没有必要了,这时joins就开始闪光了!

@companies = Company.joins(:persons).where(:persons => { active: true } ).all

@companies.each do |company| 
     company.name 
end

joins方法使用懒加载(lazy loading)方式加载数据库,它只把Company表加载到内存,与Company表关联的Person表并不做要求(不加载到内存)。因此我们呢不会将冗余数据加载到内存中,虽然如果以后我们需要从同一数组变量使用Person表的数据,需要进一步的数据查询。

 

先放一些模型&关联在前面充当整个背景

class Author < ActiveRecord::Base 
  has_many :posts
end

class Post < ActiveRecord::Base  
  belongs_to :author  
  has_many :comments  
  has_many :tags
  has_many :guests, through: :comments
end 

class Comment < ActiveRecord::Base
  belongs_to :post
  has_one :guest
end 

class Guest < ActiveRecord::Base
  belongs_to :comment
end 

class Tag < ActiveRecord::Base
  belongs_to :post
end

Joins

  • rails 中 joins 产生的均是 inner join (取交集)
  • joins接收者可以是model类本身 也可以是 ActiveRecord::Relation 的实例
belongs_to
 Post.joins(:author)  #返回所有 有作者的 Post  INNER JOIN
SELECT "posts".* FROM "posts" 
  INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"
has_many
author = Author.create
Author.joins(:posts)
Author.joins(:posts).where(posts: { author: author})

产生的sql分别如下 - 可以看出Rails在关联上的处理还是非常机智的

SELECT "authors".* FROM "authors" 
  INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"

SELECT "authors".* FROM "authors" 
  INNER JOIN "posts" ON "posts"."author_id" = "authors"."id" 
  WHERE "posts"."author_id" = 1

先撸到这里 。老夫去看雪了。

  • joins也可以连接多个关联
# 下面两个操作结果等同
Post.joins(:author, :comments)
Post.joins(:author).joins(:comments)
SELECT "posts".* FROM "posts"
 INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"
 INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"

这样的链式操作会导致很大的重复产生。inner join造成的 不介意可以uniq

嵌套连接 依旧是 链式操作
# comment has_one guest
Post.joins(comments: :guest)
SELECT "posts".* FROM "posts"
 INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
 INNER JOIN "guests" ON "guests"."comment_id" = "comments"."id"
# 几何级的重复
Author.joins(posts: [{ comments: :guest}, :tags])
SELECT "authors".* FROM "authors"
 INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"
 INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
 INNER JOIN "guests" ON "guests"."comment_id" = "comments"."id"
 INNER JOIN "tags" ON "tags"."post_id" = "posts"."id"

 

Rails里面includes是比较常用的 依旧先码出背景

 

class LineItem < ActiveRecord::Base
  belongs_to :order, -> { includes :customer }
end 
class Order < ActiveRecord::Base
  belongs_to :customer
  has_many :line_items
end 
class Customer < ActiveRecord::Base
  has_many :orders, -> { includes :line_items }
end

关联 includes方法指定使用关联时要按需加载的间接关联

belongs_to

@line_item.order.customer 
如经常需要从 line_item上取到需要的 Customer 则这个 belongs_to :order 关联就有必要加上 includes

has_many & has_one

@customer.orders.line_items 
如果经常需要从 customer 上取到 line_items 则最后这个 has_many :orders 关联就有必要加上includes
没有加入includes时候的sql
2.3.0 :009 > l = LineItem.first
  LineItem Load (0.5ms)  SELECT  "line_items".* FROM "line_items"  ORDER BY "line_items"."id" ASC LIMIT 1
#<LineItem:0x007fe22f255a30> {
            :id => 1,
      :order_id => 1,
    :created_at => Fri, 22 Jan 2016 14:02:57 UTC +00:00,
    :updated_at => Fri, 22 Jan 2016 14:10:19 UTC +00:00
}
2.3.0 :010 > l.order
  Order Load (0.4ms)  SELECT  "orders".* FROM "orders" WHERE "orders"."id" = $1 LIMIT 1  [["id", 1]]
#<Order:0x007fe22f712488> {
             :id => 1,
    :customer_id => 1,
     :created_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00,
     :updated_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00
}
2.3.0 :011 > l.order.customer
  Customer Load (0.4ms)  SELECT  "customers".* FROM "customers" WHERE "customers"."id" = $1 LIMIT 1  [["id", 1]]
#<Customer:0x007fe22f23e718> {
            :id => 1,
    :created_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00,
    :updated_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00
}
加入includes之后的sql
l= LineItem.first
  LineItem Load (0.7ms)  SELECT  "line_items".* FROM "line_items"  ORDER BY "line_items"."id" ASC LIMIT 1
#<LineItem:0x007fc09e093650> {
            :id => 1,
      :order_id => 1,
    :created_at => Fri, 22 Jan 2016 14:02:57 UTC +00:00,
    :updated_at => Fri, 22 Jan 2016 14:10:19 UTC +00:00
}
2.3.0 :002 > l.order
  Order Load (0.3ms)  SELECT  "orders".* FROM "orders" WHERE "orders"."id" = $1 LIMIT 1  [["id", 1]]
  Customer Load (2.5ms)  SELECT "customers".* FROM "customers" WHERE "customers"."id" IN (1)
#<Order:0x007fc09dd030f8> {
             :id => 1,
    :customer_id => 1,
     :created_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00,
     :updated_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00
}
2.3.0 :003 > l.order.customer
#<Customer:0x007fc09e310b80> {
            :id => 1,
    :created_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00,
    :updated_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00
}

查询

N+1 问题
# IN (1,7) 是因为有关联的order的id是 1和7
line_items = LineItem.includes(:order).limit(10)
  LineItem Load (0.5ms)  SELECT  "line_items".* FROM "line_items" LIMIT 10
  Order Load (0.3ms)  SELECT "orders".* FROM "orders" WHERE "orders"."id" IN (1, 7)

line_items.each do |l| #现在这里不会有sql产生
  l.order
end
一次性加载多个关联 (可以看到belongs_to 和 has_many的区别)
Order.includes(:customer, :line_items)
  Order Load (0.3ms)  SELECT "orders".* FROM "orders"
  Customer Load (0.3ms)  SELECT "customers".* FROM "customers" WHERE "customers"."id" IN (1, 2)
  LineItem Load (0.3ms)  SELECT "line_items".* FROM "line_items" WHERE "line_items"."order_id" IN (1, 2, 4, 5, 6, 7, 3)
嵌套关联这里先不多说

includes的过滤(这里会牵扯到join)

  • left outer join
Order.includes(:line_items).where("line_items.id" => 5)
  SQL (0.5ms)  SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3, "line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3 FROM "orders"
 LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"
 WHERE "line_items"."id" = 5
[
    [0] #<Order:0x007fc09a9ec200> {
                 :id => 7,
        :customer_id => nil,
         :created_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00,
         :updated_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00
    }
]
  • inner join
Order.joins(:line_items).where("line_items.id > ?", 4).group("orders.id")
  Order Load (0.6ms)  SELECT "orders".* FROM "orders"
 INNER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"
 WHERE (line_items.id > 4) GROUP BY orders.id
[
    [0] #<Order:0x007fc099e6e4f0> {
                 :id => 5,
        :customer_id => nil,
         :created_at => Fri, 22 Jan 2016 14:28:35 UTC +00:00,
         :updated_at => Fri, 22 Jan 2016 14:28:35 UTC +00:00
    },
    [1] #<Order:0x007fc099e6e2e8> {
                 :id => 7,
        :customer_id => nil,
         :created_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00,
         :updated_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00
    }
]

eager_load LEFT OUTER JOIN

has_many

ords =Order.eager_load(:line_items)
SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3, 
"line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3 
FROM "orders" LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"

可以看出 已经LEFF OUTER JOIN了 line_items 下面这个遍历是不会产生SQL查询的

ords.each do |order|
  order.line_items
end
o = Order.eager_load(:line_items).where("customer_id < ?", 5)
SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3,
   "line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3 
   FROM "orders" LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id" WHERE (customer_id < 5)

belongs_to

也是提前加载 和has_many 类似 不过多陈述

 

 


参考链接:

返回列表