2023年6月12日 02:58:14go评论66阅读模式

英文:

Why move where clause to join clause produce different result

问题

在SQL Server中，我有两个大表，每个表约有5亿条记录，我期望这两个查询会产生相同的结果，但实际情况不是这样的！

select count(*)
from ff
left join d on
ff.ID = d.id
where ff.sendtime < '2023-01-21' and d.id is null

select count(*)
from ff
left join d on
ff.sendtime < '2023-01-21' and ff.ID = d.id
where d.id is null

英文:

In SQl server I have two big tables with each one about 500,000,000 records,
I expect that these two query produce same result, but not!

select count(*)
from ff
         left join d on
    ff.ID = d.id
where ff.sendtime &lt; &#39;2023-01-21&#39; and d.id is null

select count(*)
from ff
         left join d on
    ff.sendtime &lt; &#39;2023-01-21&#39; and ff.ID = d.id
where d.id is null

答案1

得分: 2

First query

选择所有在2023-01-21之前的ff行。然后根据ID与所有d行进行外连接。然后排除所有在dd中有匹配的ff行。因此，您选择了所有在2023-01-21之前的ff行，其中不存在d行。

Second query

选择所有ff行，不管日期如何。根据ID和ff日期外连接所有d行。然后排除所有在dd中有匹配的ff行。因此，您选择了所有ff行，其senddate大于等于2023-01-21或没有与d匹配的行。

Anti joins

反连接，其中您对一个表进行外连接并仅保留外连接的行，用于选择在该表中没有匹配数据的数据。我认为“NOT EXISTS”或“NOT IN”比连接更易读。这些子句通过将感兴趣的表放在“FROM”子句中，将其他表的查找放在“WHERE”子句中，明确了要选择的数据以及选择数据的条件。

这是我如何编写第一个查询：

select count(*)
from ff
where sendtime < '2023-01-21'
and not exists (select * from d where d.id = ff.id);

这是我如何编写第二个查询：

select count(*)
from ff
where sendtime >= '2023-01-21'
or not exists (select * from d where d.id = ff.id);

英文:

First query

Select all ff rows before 2023-01-21. Then outer join all d rows on the ID. Then dismiss all rows where ff had a match in dd. Thus you select all ff rows before 2023-01-21 for which not exists a d row.

Second query

Select all ff rows, no matter which date. Outer join all d rows based on the ID and the ff date. Then dismiss all rows where ff had a match in dd. Thus you select all ff rows the senddate of which is >= 2023-01-21 or that don't have a match in d.

Anti joins

Anti joins, where you outer join a table and only keep the outer joined rows, are meant to select data where there is no matching data in that table. I consider NOT EXISTS or NOT IN way more readable than the join. These clauses make plain what you want to select and what the condition is for selecting the data, by putting the table of interest in the FROM clause and the lookup of the other table in the WHERE clause.

This is how I'd write the first query:

select count(*)
from ff
where sendtime &lt; &#39;2023-01-21&#39;
and not exists (select * from d where d.id = ff.id);

This is how I'd write the second query:

select count(*)
from ff
where sendtime &gt;= &#39;2023-01-21&#39;
or not exists (select * from d where d.id = ff.id);

答案2

得分: 0

第一个查询中，基于“ID”列执行了表ff和d之间的左连接，而在第二个查询中，则检查“ff.sendtime”是否小于给定时间，即2023-01-21，并且“ID”在两个表中是否匹配。

英文:

In the first query the left join between tables ff and d is performed based on the "ID" column where as in the second it checks "ff.sendtime" is less than given time i.e.., 2023-01-21 and "ID" is matching in two tables or not.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

为什么将“where”子句移到“join”子句会产生不同的结果。

问题

答案1

First query

Second query

Anti joins

First query

Second query

Anti joins

答案2

Golang和MySQL – 管理连接

统计属于同一客户的产品组合数量。

如何在PostgreSQL中正确创建JSON？

如何在Oracle SQL中将带有分隔符的数字序列转换为二进制？

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论