JOIN with conditions vs WHERE clause

huangapple go评论59阅读模式
英文:

JOIN with conditions vs WHERE clause

问题

Query #1:

选择 t1.PK 
 table1 as t1 
加入 table2 as t2  t1.pk = t2.pk .... 
其中 t1.someattribute = somecondition

Query #2:

选择 t1.PK 
 table1 as t1 
加入 table2 as t2  (t1.pk = t2.pk 
                      并且 t1.someattribute = somecondition)

哪一个性能更好?我的理解是第二个更好,因为它在连接这两个表的同时选择要处理的行,因此它不会在内存中保留大量的行(取决于表的大小),而在处理 where 子句时会被丢弃,这样只会浪费内存。

但是当我测试它们时,我没有看到很大的差异。是否有人可以确认/否定在连接发生时添加 AND 是否会减少表处理的行数的前提是真实的?

英文:

I have two queries.

Query #1:

select t1.PK 
from table1 as t1 
join table2 as t2 on t1.pk = t2.pk .... 
where t1.someattribute = somecondition

Query #2:

select t1.PK 
from table1 as t1 
join table2 as t2 on (t1.pk = t2.pk 
                      and t1.someattribute = somecondition)

Which one would perform better? My understanding was that the second one was better because while it is joining the two tables it is selecting which rows it wants to work with and therefore it won't keep in memory a huge number of rows (depending on the size of the table) that when the where clause is processed will be discarded and only were kept in memory for nothing.

But when I test them I don't see a big difference. Could anybody confirm/deny if the premise that adding an AND while the join is happening reduces the number of rows that the table works with is true or not?

答案1

得分: 1

SQL是一种声明性语言。你通过SQL描述你想要什么,而接收SQL的系统决定如何为你获取它。换句话说,你不太关心它是如何获得结果的。因此,在几乎所有的关系数据库管理系统中,这两个语句是等效的。

在解析SQL之后,系统将执行优化步骤,以尽快执行。任何值得一提的关系数据库管理系统都会执行一个叫做“谓词推送”的优化步骤,在这个步骤中,它将取出WHERE子句中的条件,并确定在读取表时是否可以应用这些条件于数据选择。

如果选择了所有内容,然后在所有连接后才查看WHERE子句,那将有点愚蠢。总之,微软的目标不是不必要地消耗计算资源,或许是吧,我不在那里工作。

无论如何,我没有Azure实例来测试,但如果你想确认的话,你可以对这两个查询运行一个解释/执行计划,亲自查看是否有任何差异。很可能它们会相同。

英文:

SQL is a declarative language. You describe what you want via SQL, and the system to which you submit that SQL decides how to get it for you. In other words, you don't get much say about HOW it gets to that result. As such, in just about every RDBMS out there, these two statements are synonymous.

After parsing the sql, the system will perform optimization steps to execute as quickly as possible. Any RDBMS worth its salt will perform an optimization step called "predicate pushdown" in which it will take conditions in the WHERE clause and determine if it can apply them to data selection when it reads the tables.

It would be kind of silly to select EVERYTHING and only after shuffling the data together through all the joins finally look at the WHERE clause. At the end of the day Microsoft isn't in the cloud business to burn compute unnecessarily, or maybe it is. I don't work there.

At any rate, I don't have an Azure instance to test, but if you want to confirm you can run an explain/execution plan for both queries and see for yourself if there any differences. Likely they will be the same.

huangapple
  • 本文由 发表于 2023年6月22日 02:00:25
  • 转载请务必保留本文链接:https://go.coder-hub.com/76526003.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定