SQL中left join、right join、inner join的区别 | Rails joins & includes & eager_load
SQL中left join、right join、inner join的区别
三个join的含义:
- left join(左联接):返回左表中的所有记录以及和右表中的联接字段相等的记录。
- right join(右联接):返回右表中的所有记录以及和左表中的联接字段相等的记录。
- inner join(等值联接):只返回两个表中联接字段相等的记录。
下面将以atable、btable为例进行讲解。
atable:
tablea.png
btable:
tableB.png
left join
select * from atable as a
left join btable as b
on a.name = b.name;
结果如下:
leftjoin.png
可以看到,tableA join tableB的结果中,既有tableA中的所有记录,同时还包括了与右表中联接字段相等的记录,所以返回的记录总数一定是大于或等于tableA的记录总数。
right join的就不赘述了。
inner join
select * from atable as a
inner join btable as b
on a.name = b.name;
结果如下:
innerjoin.png
可以看到,tableA inner join tableB的结果中,只包含了tableA与tableB联接字段相等的记录。可以看作是tableA和tableB的交集。
Rails joins & includes & eager_load
includes和joins的不同
当includes和joins的时候最重要的概念就是他们有他们的典型用例。
includes使用贪婪加载(eager loading)而joins使用懒加载(lazy loading),两者都非常有用,但是也都很容易被滥用导致程序性能降低或过度使用。
如果我们看一眼ruby on rails文档,描述includes最重要的一句话是:
使用includes,Active Record确保使用最少的查询加载所有指定的连接。
换句话说,当我们的查询需要一个关联表时,两者(我们要查的表和关联表)都会被加载到内存中,这反过来在检索关联数据时会减少查询的次数。下边例子中我们检索companies表中所有关联person表的记录。
@companies = Company.includes(:persons).where(:persons => { active: true } ).all
@companies.each do |company|
company.person.name
end
当我们迭代每个companies表中记录并且显示其对应person名字时,通常情况下,我们不得不每次使用单独的查询来检索person的名字。然而当我们使用includes方法时,渴望加载会把这关联的两个表加载到内存,所以这些代码只需要以此查询。很棒,对吧?
所以当我们想检索所有关联person的companies数据,但是不想显示Person表中数据的时候,使用includes会发生什么呢?这时候加载关联的表就没有必要了,这时joins就开始闪光了!
@companies = Company.joins(:persons).where(:persons => { active: true } ).all
@companies.each do |company|
company.name
end
joins方法使用懒加载(lazy loading)方式加载数据库,它只把Company表加载到内存,与Company表关联的Person表并不做要求(不加载到内存)。因此我们呢不会将冗余数据加载到内存中,虽然如果以后我们需要从同一数组变量使用Person表的数据,需要进一步的数据查询。
先放一些模型&关联在前面充当整个背景
class Author < ActiveRecord::Base
has_many :posts
end
class Post < ActiveRecord::Base
belongs_to :author
has_many :comments
has_many :tags
has_many :guests, through: :comments
end
class Comment < ActiveRecord::Base
belongs_to :post
has_one :guest
end
class Guest < ActiveRecord::Base
belongs_to :comment
end
class Tag < ActiveRecord::Base
belongs_to :post
end
Joins
- rails 中 joins 产生的均是 inner join (取交集)
- joins接收者可以是model类本身 也可以是 ActiveRecord::Relation 的实例
belongs_to
Post.joins(:author) #返回所有 有作者的 Post INNER JOIN
SELECT "posts".* FROM "posts"
INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"
has_many
author = Author.create
Author.joins(:posts)
Author.joins(:posts).where(posts: { author: author})
产生的sql分别如下 - 可以看出Rails在关联上的处理还是非常机智的
SELECT "authors".* FROM "authors"
INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"
SELECT "authors".* FROM "authors"
INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"
WHERE "posts"."author_id" = 1
先撸到这里 。老夫去看雪了。
- joins也可以连接多个关联
# 下面两个操作结果等同
Post.joins(:author, :comments)
Post.joins(:author).joins(:comments)
SELECT "posts".* FROM "posts"
INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"
INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
这样的链式操作会导致很大的重复产生。inner join造成的 不介意可以uniq
嵌套连接 依旧是 链式操作
# comment has_one guest
Post.joins(comments: :guest)
SELECT "posts".* FROM "posts"
INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
INNER JOIN "guests" ON "guests"."comment_id" = "comments"."id"
# 几何级的重复
Author.joins(posts: [{ comments: :guest}, :tags])
SELECT "authors".* FROM "authors"
INNER JOIN "posts" ON "posts"."author_id" = "authors"."id"
INNER JOIN "comments" ON "comments"."post_id" = "posts"."id"
INNER JOIN "guests" ON "guests"."comment_id" = "comments"."id"
INNER JOIN "tags" ON "tags"."post_id" = "posts"."id"
Rails里面includes是比较常用的 依旧先码出背景
class LineItem < ActiveRecord::Base
belongs_to :order, -> { includes :customer }
end
class Order < ActiveRecord::Base
belongs_to :customer
has_many :line_items
end
class Customer < ActiveRecord::Base
has_many :orders, -> { includes :line_items }
end
关联 includes方法指定使用关联时要按需加载的间接关联
belongs_to
@line_item.order.customer
如经常需要从 line_item上取到需要的 Customer 则这个 belongs_to :order 关联就有必要加上 includes
has_many & has_one
@customer.orders.line_items
如果经常需要从 customer 上取到 line_items 则最后这个 has_many :orders 关联就有必要加上includes
没有加入includes时候的sql
2.3.0 :009 > l = LineItem.first
LineItem Load (0.5ms) SELECT "line_items".* FROM "line_items" ORDER BY "line_items"."id" ASC LIMIT 1
#<LineItem:0x007fe22f255a30> {
:id => 1,
:order_id => 1,
:created_at => Fri, 22 Jan 2016 14:02:57 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:10:19 UTC +00:00
}
2.3.0 :010 > l.order
Order Load (0.4ms) SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 LIMIT 1 [["id", 1]]
#<Order:0x007fe22f712488> {
:id => 1,
:customer_id => 1,
:created_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00
}
2.3.0 :011 > l.order.customer
Customer Load (0.4ms) SELECT "customers".* FROM "customers" WHERE "customers"."id" = $1 LIMIT 1 [["id", 1]]
#<Customer:0x007fe22f23e718> {
:id => 1,
:created_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00
}
加入includes之后的sql
l= LineItem.first
LineItem Load (0.7ms) SELECT "line_items".* FROM "line_items" ORDER BY "line_items"."id" ASC LIMIT 1
#<LineItem:0x007fc09e093650> {
:id => 1,
:order_id => 1,
:created_at => Fri, 22 Jan 2016 14:02:57 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:10:19 UTC +00:00
}
2.3.0 :002 > l.order
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."id" = $1 LIMIT 1 [["id", 1]]
Customer Load (2.5ms) SELECT "customers".* FROM "customers" WHERE "customers"."id" IN (1)
#<Order:0x007fc09dd030f8> {
:id => 1,
:customer_id => 1,
:created_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:07:43 UTC +00:00
}
2.3.0 :003 > l.order.customer
#<Customer:0x007fc09e310b80> {
:id => 1,
:created_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:04:22 UTC +00:00
}
查询
N+1 问题
# IN (1,7) 是因为有关联的order的id是 1和7
line_items = LineItem.includes(:order).limit(10)
LineItem Load (0.5ms) SELECT "line_items".* FROM "line_items" LIMIT 10
Order Load (0.3ms) SELECT "orders".* FROM "orders" WHERE "orders"."id" IN (1, 7)
line_items.each do |l| #现在这里不会有sql产生
l.order
end
一次性加载多个关联 (可以看到belongs_to 和 has_many的区别)
Order.includes(:customer, :line_items)
Order Load (0.3ms) SELECT "orders".* FROM "orders"
Customer Load (0.3ms) SELECT "customers".* FROM "customers" WHERE "customers"."id" IN (1, 2)
LineItem Load (0.3ms) SELECT "line_items".* FROM "line_items" WHERE "line_items"."order_id" IN (1, 2, 4, 5, 6, 7, 3)
嵌套关联这里先不多说
includes的过滤(这里会牵扯到join)
- left outer join
Order.includes(:line_items).where("line_items.id" => 5)
SQL (0.5ms) SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3, "line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3 FROM "orders"
LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"
WHERE "line_items"."id" = 5
[
[0] #<Order:0x007fc09a9ec200> {
:id => 7,
:customer_id => nil,
:created_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00
}
]
- inner join
Order.joins(:line_items).where("line_items.id > ?", 4).group("orders.id")
Order Load (0.6ms) SELECT "orders".* FROM "orders"
INNER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"
WHERE (line_items.id > 4) GROUP BY orders.id
[
[0] #<Order:0x007fc099e6e4f0> {
:id => 5,
:customer_id => nil,
:created_at => Fri, 22 Jan 2016 14:28:35 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:28:35 UTC +00:00
},
[1] #<Order:0x007fc099e6e2e8> {
:id => 7,
:customer_id => nil,
:created_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00,
:updated_at => Fri, 22 Jan 2016 14:28:36 UTC +00:00
}
]
eager_load LEFT OUTER JOIN
has_many
ords =Order.eager_load(:line_items)
SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3,
"line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3
FROM "orders" LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id"
可以看出 已经LEFF OUTER JOIN了 line_items 下面这个遍历是不会产生SQL查询的
ords.each do |order|
order.line_items
end
o = Order.eager_load(:line_items).where("customer_id < ?", 5)
SELECT "orders"."id" AS t0_r0, "orders"."customer_id" AS t0_r1, "orders"."created_at" AS t0_r2, "orders"."updated_at" AS t0_r3,
"line_items"."id" AS t1_r0, "line_items"."order_id" AS t1_r1, "line_items"."created_at" AS t1_r2, "line_items"."updated_at" AS t1_r3
FROM "orders" LEFT OUTER JOIN "line_items" ON "line_items"."order_id" = "orders"."id" WHERE (customer_id < 5)
belongs_to
也是提前加载 和has_many 类似 不过多陈述