SQL Server comparison operations in ON (e.g. ON a > b): output not as expected
Question
I have a question about SQL Server and wonder if anyone can help:
Below is my query:
select distinct a.*, b.*
from table_a a
left join
(select * from table_b where number_ in ('1','2') ) b
on (a.id=b.id
and cast(a.date_ as date)>= cast('2021_12_01' as date))
However, looking at the results I got, the condition a.date_ >= cast('2021_12_01' as date) doesn't seem to be working: I noticed the output date has values smaller than 2021_12_01. Can anyone help me understand what went wrong?
I tested the condition on its own using the statement below and it successfully filtered the dates, so I'm not sure why it doesn't work in the "on" clause:
select * from table_a
where cast (date_ as date)>= cast('2021_12_01' as date)
table_a
id | date_ | amount |
---|---|---|
1 | 2019-07-16 11:59:09.000 | 20 |
2 | 2022-07-16 10:59:09.000 | 290 |
table_b
id | date_alert | number_ |
---|---|---|
1 | 2020-01-14 10:03:02.000 | 2 |
2 | 2020-01-14 10:05:02.000 | 2 |
I expect the result to include only the rows related to id 2, since its date_ is later than 2021/12/01.
Answer 1
Score: 2
Try this:

```sql
SELECT DISTINCT a.*, b.*
FROM Table_A a
INNER JOIN Table_B b ON a.id = b.id AND b.number_ IN (1, 2)
WHERE a.date_ >= '20211201'
```

The biggest change relative to the question is moving the date comparison to the `WHERE` clause. It's based on this information:

> I noticed the output date has values smaller than 2021_12_01.

If the comparison is part of the `ON` clause for a `LEFT JOIN`, then **it's expected** that _every row of `Table_A` will be included_, even for dates smaller than the desired `2021_12_01`. If you want to exclude those rows, as seems to be the case, you need to either move the condition to the `WHERE` clause or change to an `INNER JOIN`. (I did both, because it seems like that's the expected result, but leaving this as a `LEFT JOIN` is still an option, and an easy change to revert: just change the one word.)

Other changes:

1. Converted the subquery to a JOIN directly to the table, with an additional JOIN condition.
2. Converted the `'1','2'` literals to numbers. I _certainly hope_ a column named `number_` is _not defined as a varchar_, so this change should help improve performance by improving index use and reducing data conversions.
3. Used the [correct date literal format][1].
4. Removed the `cast` on `a.date_`. Again, I certainly hope a column named `date_` is _not defined as varchar!_ Given that assumption, and also knowing that casting a `datetime` to a `date` truncates any time portion, we can logically know that comparing whether any `datetime` value is greater than or equal to a given date at midnight will give the **same result** as comparing the truncated version of the datetime, meaning the cast was _not necessary._ Removing it can **dramatically** improve performance by again reducing conversion operations and improving index use for the query.

[1]: https://blogs.msmvps.com/jcoehoorn/blog/2022/07/13/sql-and-dates/#Formats
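To see the difference between filtering in `ON` and filtering in `WHERE`, here is a minimal sketch that loads the sample rows from the question into table variables and runs the join both ways. It assumes `date_` and `date_alert` are `datetime` and `number_` is an `int` (the question does not state the column types), and the table-variable names `@table_a` and `@table_b` are only illustrative stand-ins for the real tables.

```sql
-- Sample data from the question, held in table variables.
-- Column types are assumptions; the question does not state them.
DECLARE @table_a TABLE (id int, date_ datetime, amount int);
DECLARE @table_b TABLE (id int, date_alert datetime, number_ int);

INSERT INTO @table_a (id, date_, amount) VALUES
    (1, '2019-07-16T11:59:09', 20),
    (2, '2022-07-16T10:59:09', 290);

INSERT INTO @table_b (id, date_alert, number_) VALUES
    (1, '2020-01-14T10:03:02', 2),
    (2, '2020-01-14T10:05:02', 2);

-- 1) Date test inside the ON clause of a LEFT JOIN.
--    The test only decides whether a table_b row matches;
--    every table_a row is still returned.
SELECT a.id, a.date_, a.amount, b.date_alert, b.number_
FROM @table_a AS a
LEFT JOIN @table_b AS b
    ON a.id = b.id
   AND b.number_ IN (1, 2)
   AND a.date_ >= '20211201';

-- 2) Date test inside the WHERE clause.
--    The test now filters table_a rows after the join.
SELECT a.id, a.date_, a.amount, b.date_alert, b.number_
FROM @table_a AS a
LEFT JOIN @table_b AS b
    ON a.id = b.id
   AND b.number_ IN (1, 2)
WHERE a.date_ >= '20211201';
```

Under those assumptions, the first query should return both `id` 1 (with `NULL` in the `table_b` columns, because its date fails the `ON` test) and `id` 2, while the second should return only `id` 2.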