Why does GROUP BY give me too high a row count in BigQuery?

Question

I believe I am missing something (probably quite simple) in the use of GROUP BY in BigQuery, and I am hoping someone can set me straight.

Comparing these two queries, I get different numbers of users:

    SELECT SUM(users) FROM (
        SELECT
            DATE,
            COUNT(DISTINCT user_id) AS users
        FROM
            `mytable`
        WHERE
            DATE BETWEEN ('2022-05-01') AND ('2022-05-31')
        GROUP BY
            DATE
    )

Approximate value for users: 140,000

    SELECT
        COUNT(DISTINCT user_id) AS users
    FROM
        `mytable`
    WHERE
        DATE BETWEEN ('2022-05-01') AND ('2022-05-31')

Approximate value for users: 120,000



# Answer 1
**Score**: 2

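In the second query you're counting the distinct user_id values across the entire date range, so each user is counted once. In the first query you're counting the distinct user_id values *for each day* in the range, then summing those daily counts. A user who appears on several days is counted once per day there, so the sum will be at least as large as the range-wide distinct count, and larger whenever any user is active on more than one day. That is most likely where the extra ~20,000 comes from.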
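To make the difference concrete, here is a minimal sketch that runs both query shapes against a tiny made-up table. It uses SQLite through Python's `sqlite3` module as a stand-in for BigQuery, and the column name `day` and the four sample rows are assumptions for illustration only, not data from the question.

```python
import sqlite3

# Hypothetical sample data: user "a" is active on both days.
rows = [
    ("2022-05-01", "a"),
    ("2022-05-01", "b"),
    ("2022-05-02", "a"),
    ("2022-05-02", "c"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (day TEXT, user_id TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)

# Shape of the first query: distinct users per day, then summed.
sum_of_daily = conn.execute(
    """
    SELECT SUM(users) FROM (
        SELECT day, COUNT(DISTINCT user_id) AS users
        FROM mytable
        GROUP BY day
    )
    """
).fetchone()[0]

# Shape of the second query: distinct users over the whole range.
distinct_in_range = conn.execute(
    "SELECT COUNT(DISTINCT user_id) FROM mytable"
).fetchone()[0]

print(sum_of_daily, distinct_in_range)  # 4 3
```

User "a" contributes to both daily counts (2 + 2 = 4) but only once to the range-wide count (a, b, c = 3). The difference between the two results is the number of extra per-day appearances of repeat users.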




huangapple
  • Published on June 13, 2023, 01:07:49
  • Source link: https://go.coder-hub.com/76458874.html