Rails的`include`返回经过筛选的关系而不是所有关系。

huangapple go评论60阅读模式
英文:

Rails include returns filtered relations instead of all relations

问题

我使用了 includes 而不是 join,因为它运行得更快,但是该语句返回的关联不包括我正在寻找的所有数据。它返回所有左侧数据,但只返回与查询匹配的右侧数据。希望下面的示例可以帮助澄清问题以及我试图实现的目标。

join 似乎从数据和 Rails 关联的角度来看确实做到了我要的事情,但执行了大量的查询,速度较慢。

设置和示例

class Species < ActiveRecord::Base
    has_many :species_types, foreign_key: 'species_id', primary_key: "id"
end

class SpeciesTypes < ActiveRecord::Base
    belongs_to :species, :foreign_key => "id", :primary_key => "species_id"
end

create_table "species", force: :cascade do |t|
    t.bigint "id"
    t.string "identifier"
end

create_table "species_types", force: :cascade do |t|
    t.bigint "species_id"
    t.bigint "type_id"
    t.string "name"
end

用于可视化下面查询的表数据

Species

id identifier
1 furry
2 sleek
3 hairy
4 shiny
5 reflective
6 rough
7 rubbery

SpeciesTypes

species_id type_id identifier
1 1 hairy
1 2 metalic
2 3 skin
3 1 hairy
4 2 metalic
4 3 skin
5 3 skin
5 3 skin
6 2 metalic
7 2 metalic

我知道 SpeciesTypes.type_id,我想要获取所有具有该类型的 Species,包括它们所有的 SpeciesTypes

使用 includes

`species = Species.includes(:species_types).where(:species_types => {:type_id => 1})`

这确实返回了所有具有匹配 SpeciesTypeSpecies。但是,它不是返回所有 Species 以及它们的所有 SpeciesTypes,而是只返回与 :type_id 参数匹配的 SpeciesTypes。因此,在这种情况下,您无法从 Species 对象中引用所有 SpeciesTypesspecies[0].species_types)。虽然它不返回预期的内容,但它有道理为什么限制到匹配的 type_id

从上述查询的响应中获取的 Species
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types

[#<SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 1,
 identifier: hairy>]
我希望得到的是这样的:
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types

[#<SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 1,
 identifier: hairy>,
 <SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 2,
 identifier: metalic>,
 ]

使用 joins

这是返回我所期望的内容(使用 join 而不是 includes),但查询速度要慢得多。我认为我可能漏掉了一些明显的东西(或者不明显但根本的东西)。

species = Species.joins(:species_types).where(:species_types => {:type_id => 3})

上面的查询返回了我期望的值,但查询速度较慢。

是否可以更新 includes 查询以返回具有匹配已知 :type_id 的所有 Species 以及所有类型?

英文:

I'm using an includes instead of a join because it runs faster but the statement is returning an association that doesn't include all of the data I'm looking for. It returns all of the left data, but only the right data that matches the query. Hopefully the examples below help clarify the problem and what I'm trying to achieve.

The join does seem to do what I'm after from a data and rails association perspective but executes a ton of queries and is much slower.

Setup and examples

class Species < ActiveRecord::Base
    has_many :species_types, foreign_key: 'species_id', primary_key: "id"

end

class SpeciesTypes < ActiveRecord::Base
    belongs_to :species, :foreign_key => "id", :primary_key => "species_id"

end

create_table "species", force: :cascade do |t|
    t.bigint "id"
    t.string "identifier"
end

create_table "species_types", force: :cascade do |t|
    t.bigint "species_id"
    t.bigint "type_id"
    t.string "name"
end

Table data to help visualize the queries below

Species

id identifier
1 furry
2 sleek
3 hairy
4 shiny
5 reflective
6 rough
7 rubbery

SpeciesTypes

species_id type_id identifier
1 1 hairy
1 2 metalic
2 3 skin
3 1 hairy
4 2 metalic
4 3 skin
5 3 skin
5 3 skin
6 2 metalic
7 2 metalic

I know the SpeciesTypes.type_id, and I'm looking to get all Species that have that type, including all of their SpeciesTypes.

Using includes

`species = Species.includes(:species_types).where(:species_types => {:type_id => 1})`

This does return all Species with a matching SpeciesType. However, instead of returning all Species with all SpeciesType it return all Species with only the SpeciesType that match the :type_id parameter. So, in this case you cannot reference all SpeciesTypes from the Species object (species[0].species_types). Does not return what was expected, although it makes sense why it does limit to the matched type_id.

Response from above query for Species
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types

[#<SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 1,
 identifier: hairy>]
I'm looking for this:
irb()$ species = Species.includes(:species_types).where(:species_types => {:type_id => 1})
irb()$ species[0].species_types

[#<SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 1,
 identifier: hairy>,
 <SpeciesTypes:0x0000ffff9ad73490                       
 species_id: 1,
 type_id: 2,
 identifier: metalic>,
 ]

Using joins

This is returning what I'm after (using join instead of includes) however the query is much much slower. I think I'm missing something obvious (or not obvious but fundamental)

species = Species.joins(:species_types).where(:species_types => {:type_id => 3})

The above returns the values that I expect but is a much slower query.

Can the includes query be updated to return all Species with all types that match the known :type_id?

答案1

得分: 1

以下是翻译好的部分:

"While its pretty natural to think that Species.includes(:species_types).where(:species_types => {:type_id => 3}) would load all the species and just eager load the species_types that match the where clause thats not how ActiveRecord and SQL actually works."

尽管认为Species.includes(:species_types).where(:species_types => {:type_id => 3})会加载所有物种并仅急切加载与where子句匹配的species_types,但这不是ActiveRecord和SQL的实际工作方式。

"What this generates in terms of a query something like:"

关于查询生成的内容如下:

"SELECT species.name AS t0_c1, species_types.id AS t1_c1 ..."
"LEFT OUTER JOIN species_types, t1"
"ON species_types.specie_id = species.id"
"WHERE species_types.type_id = ?"

当您使用includes并引用其他表时,它会委托给.eager_load,这将在单个数据库查询中加载两个表。

在这里,where子句应用于整个查询,而不仅仅是连接的关联。请记住,此查询对每个species_types行返回一行(其中物种表的数据重复)。

如果您想要仅加载与条件匹配的记录,您需要将限制条件放入JOIN子句中:

"SELECT species.name AS t0_c1, ..."
"LEFT OUTER JOIN species_types, t1"
"ON species_types.specie_id = species.id AND species_types.type_id = ?"

不幸的是,ActiveRecord关联不提供这样做的方法。

解决问题的最简单方法很可能是从另一端查询:

"Type.find(1)"
".specie_types.where(specie: specie)"

".joins"不是答案

您不能只用joins替换includes,因为它们执行的操作不同。

joins只是向查询添加INNER LEFT JOIN,但实际上不会从连接的表中选择任何列。它用于基于连接的表过滤关联或选择聚合,而不是防止N+1查询。

在这种情况下,最有可能的是第一个查询本身不太慢,而是在迭代specie_types时创建了N+1查询,因为关联没有急切加载/预加载。

includes执行的是OUTER LEFT JOIN,将加载相关记录,具体取决于如何使用它,可以使用一次或两次查询。

英文:

While its pretty natural to think that Species.includes(:species_types).where(:species_types => {:type_id => 3}) would load all the species and just eager load the species_types that match the where clause thats not how ActiveRecord and SQL actually works.

What this generates in terms of a query something like:

SELECT species.name AS t0_c1, species_types.id AS t1_c1 ...
LEFT OUTER JOIN species_types, t1
  ON species_types.specie_id = species.id
WHERE species_types.type_id = ?

When you use includes and reference the other table it delegates to .eager_load which loads both tables in a single database query.

The where clause here applies to the entire query and not just the joined assocation. Remember that this query returns a row for every species_types row (with duplicate data for the species table).

If you wanted to load just the records that match the condition you would need to put the restriction into the JOIN clause:

SELECT species.name AS t0_c1, ...
LEFT OUTER JOIN species_types, t1
  ON species_types.specie_id = species.id AND species_types.type_id = ?

Unfortunately ActiveRecord associations do not provide a way to do that.

The easiest solution to the problem is most likely to just query from the other end:

Type.find(1)
    .specie_types.where(specie: specie)

.joins is not the answer

You can't just replace includes with joins as they do very things.

joins just adds an INNER LEFT JOIN to the query but doesn't actually select any columns from the joined table. Its used to filter the assocation based on the joined table or to select aggregates. Not to prevent N+1 queries.

In this case it's most likely not the first query itself thats slower - rather you're creating a N+1 query when you iterate through specie_types as the assocation is not eager loaded / preloaded.

includes does an OUTER LEFT JOIN and will load the assocatiated records either in one or two queries depending on how its used.

huangapple
  • 本文由 发表于 2023年2月8日 17:02:33
  • 转载请务必保留本文链接:https://go.coder-hub.com/75383373.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定