Why combining AND and OR operators at the same level of a WHERE clause causes partitions not to be pruned
Question
I have a BigQuery (BQ) table partitioned by the date_hit field, which is of type DATE. Today, one of the analysts I work with sent me this query, saying that partitions were not being pruned:
SELECT
  * -- simplified for the example
FROM mytable
WHERE
  date_hit = "2022-10-28"
  AND event_label LIKE "%entrar%"
  OR event_label LIKE "%aderir %"
Indeed, this query was retrieving 32TB of data, so partitions were not being pruned even though we specified the exact date we wanted to retrieve. After reading the query, I guessed that what she was really trying to write was:
SELECT
  * -- simplified for the example
FROM mytable
WHERE
  date_hit = "2022-10-28"
  AND (
    event_label LIKE "%entrar%"
    OR event_label LIKE "%aderir %"
  )
With this change, the partitions were pruned properly and the query retrieved only 47GB.
What I'm trying to understand is, what was happening with the first query? Why was it querying so much data?
Answer 1
Score: 1
In SQL (and most programming languages), AND has higher precedence than OR. So the original WHERE clause was being evaluated as this:
WHERE (date_hit = "2022-10-28" AND event_label LIKE "%entrar%") OR
event_label LIKE "%aderir %"
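The second branch of the OR places no constraint on date_hit, so rows from any partition can satisfy it and no partition can be pruned. If you intend to use the second version, then you must use explicit parentheses: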
WHERE date_hit = "2022-10-28" AND
(event_label LIKE "%entrar%" OR event_label LIKE "%aderir %")
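To see the precedence effect without touching a 32TB table, here is a minimal sketch that runs both WHERE clauses against a tiny in-memory SQLite table. The rows, the TEXT column types, and the helper name matching_dates are assumptions made up for illustration. SQLite stands in for BigQuery only to show how AND and OR group; it has no partitions, so the set of distinct dates returned is used as a rough proxy for which partitions a query could need to read. String literals use single quotes here because SQLite may treat double-quoted text as an identifier.

```python
import sqlite3

# Hypothetical miniature of the table: two dates, a few labels.
rows = [
    ("2022-10-28", "entrar agora"),
    ("2022-10-28", "aderir plano"),
    ("2022-10-27", "aderir plano"),  # other date, still matches the OR branch
    ("2022-10-27", "outro"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date_hit TEXT, event_label TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)


def matching_dates(where_clause):
    """Return the distinct date_hit values of rows matched by the clause."""
    sql = (
        "SELECT DISTINCT date_hit FROM mytable "
        f"WHERE {where_clause} ORDER BY date_hit"
    )
    return [r[0] for r in conn.execute(sql)]


# Original clause: parsed as (date AND entrar) OR aderir.
no_parens = matching_dates(
    "date_hit = '2022-10-28' AND event_label LIKE '%entrar%' "
    "OR event_label LIKE '%aderir %'"
)

# Intended clause: date AND (entrar OR aderir).
with_parens = matching_dates(
    "date_hit = '2022-10-28' AND ("
    "event_label LIKE '%entrar%' OR event_label LIKE '%aderir %')"
)

print("without parentheses:", no_parens)
print("with parentheses:   ", with_parens)
```

Without parentheses, the result includes 2022-10-27 as well as 2022-10-28, because the second LIKE matches regardless of date. With parentheses, only 2022-10-28 comes back, which is the situation in which a partition filter can actually restrict the scan.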