Snowflake SQL GROUP BY behaves differently depending on whether columns are referenced by position or by alias

Question

I am trying to understand why GROUP BY yields different results in Snowflake depending on how I reference the grouping fields. Here are two queries that I believe should yield the same result, but do not:

Query using explicit field alias references:

select
    hash('SHA2_256', CONCAT(field1,field2,field3,field4)) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
    where 
        <some filters>
    group by hash, field1, field2, field3, field4;

Query using positional references to fields:

select
    hash('SHA2_256', CONCAT(field1,field2,field3,field4)) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
    where 
        <same filters as above>
    group by 1,2,3,4,5;

The first query yields significantly more records, which suggests it is not grouping by the same fields as the second query. Based on the Snowflake docs, though, I really believe they should be the same. How are these two different?

Answer 1

Score: 2

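The clue is that the alias hash does not shadow an existing column of the same name. When a name in GROUP BY matches both a column of the table and an alias in the SELECT list, Snowflake resolves it to the table column. So, assuming <table> already has a column named hash, this query: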

select
     hash('SHA2_256', CONCAT(field1,field2,field3,field4)) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <some filters>
group by hash, field1, field2, field3, field4;

is interpreted as

select
     hash('SHA2_256', CONCAT(field1,field2,field3,field4)) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <some filters>
group by <table>.hash, field1, field2, field3, field4;

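which is different from the positional form, where 1 refers to the computed expression in the SELECT list rather than to the table column: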

select
     hash('SHA2_256', CONCAT(field1,field2,field3,field4)) as hash
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as count
from <table>
where <same filters as above>
group by 1,2,3,4,5;
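
One way to sidestep the ambiguity, sketched under the assumption that the table really does contain a column named hash, is to give the computed expression an alias that matches no column. The names my_table, row_hash and row_count below are hypothetical and not from the original post, and the WHERE filters from the question are left out for brevity.

```sql
-- Sketch only: my_table, row_hash and row_count are hypothetical names.
-- Assumption: my_table has a column named hash, which is what made
-- "group by hash" resolve to the table column instead of the alias.
select
     hash('SHA2_256', concat(field1, field2, field3, field4)) as row_hash  -- alias that matches no table column
    ,field1
    ,field2
    ,field3
    ,field4
    ,count(*) as row_count
from my_table
-- row_hash can only resolve to the select-list alias, so this should
-- group the same way as "group by 1,2,3,4,5"
group by row_hash, field1, field2, field3, field4;
```

If renaming the alias is not an option, grouping by position (as in the second query) or repeating the full expression in the GROUP BY clause should avoid the name clash in the same way.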

huangapple
  • Posted on February 24, 2023
  • Source link: https://go.coder-hub.com/75550501.html