英文:
aggregation of aggregation not allowed
问题
我正在使用Google BigQuery。
我想要找到在特定时间段(5月1日至5月28日)内每周每天(星期日,星期一,星期二,星期三,星期四,星期五,星期六)的平均事件数量。
例如,在5月1日至5月28日之间有4个星期日,我想找出基于这4个星期日的星期日事件的平均数量。
以下是我的代码:
SELECT EXTRACT(DAYOFWEEK FROM TIMESTAMP_MICROS(event_timestamp)) AS day,
AVG(COUNT(*)) AS average_events
FROM `table`
WHERE DATE_TRUNC(EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp)), DAY)
BETWEEN DATETIME('2023-05-01 00:00:00') AND DATETIME('2023-05-28 00:00:00')
GROUP BY day
ORDER BY day
这会引发错误,类似于“不允许聚合的聚合”。
编辑:
我尝试了以下方法 -
SELECT day,
AVG(counts) AS average_events
FROM (
SELECT EXTRACT(DAYOFWEEK FROM event_timestamp) AS day,
COUNT(*) AS counts
FROM `table`
WHERE DATE_TRUNC(event_timestamp, DAY)
BETWEEN DATETIME('2023-05-01 00:00:00') AND DATETIME('2023-05-28 00:00:00')
GROUP BY day
) AS t
GROUP BY day
ORDER BY day
但在这里,计数与平均值相同。
英文:
I am using google bigquery.
I want to find the average number of events for each day of the week (Sunday,monday,tuesday,wednesday,thursday,friday,saturday) for a certain time period (1 to 28 May).
For example there were 4 Sundays between May 1 to May 28, I want to find out average number of events for Sunday based on those 4 Sundays.
Here is my code:
SELECT EXTRACT(DAYOFWEEK FROM TIMESTAMP_MICROS(event_timestamp)) AS day,
AVG(COUNT(*)) AS average_events
FROM `table`
WHERE DATE_TRUNC(EXTRACT(DATE FROM TIMESTAMP_MICROS(event_timestamp)), DAY)
BETWEEN DATETIME('2023-05-01 00:00:00') AND DATETIME('2023-05-28 00:00:00')
GROUP BY day
ORDER BY day
this gives an error like "Aggregations of aggregations are not allowed"
edit:
I have tried the following -
SELECT day,
AVG(counts) AS average_events
FROM (
SELECT EXTRACT(DAYOFWEEK FROM event_timestamp) AS day,
COUNT(*) AS counts
FROM `table`
WHERE DATE_TRUNC(event_timestamp, DAY)
BETWEEN DATETIME('2023-05-01 00:00:00') AND DATETIME('2023-05-28 00:00:00')
GROUP BY day
) AS t
GROUP BY day
ORDER BY day
but here, the count is the same as the average.
答案1
得分: 2
你想首先按天进行聚合,然后获取每个相关的计数 - 然后按星期几进行聚合。
我建议:
select event_dow, avg(cnt) avg_cnt
from (
select
timestamp_trunc(timestamp_micros(event_timestamp), day) event_day,
extract(dayofweek from timestamp_micros(event_timestamp)) event_dow,
count(*) cnt
from mytable
where event_timestamp >= unix_micros(timestamp '2023-05-01')
and event_timestamp < unix_micros(timestamp '2023-05-29')
group by event_day, event_dow
) t
group by event_dow -- dow aggregation
order by event_dow
请注意,我尝试通过直接筛选原始值来优化查询的where
子句(我们可以将边界转换为Unix时间戳) - 并使用半开区间。
英文:
You want to aggregate by day first, and get each associated count - then by day of the week.
I would recommend:
select event_dow, avg(cnt) avg_cnt
from (
select
timestamp_trunc(timestamp_micros(event_timestamp), day) event_day
extract(dayofweek from timestamp_micros(event_timestamp)) event_dow,
count(*) cnt
from mytable
where event_timestamp >= unix_micros(timestamp '2023-05-01')
and event_timestamp < unix_micros(timestamp '2023-05-29')
group by event_day, event_dow
) t
group by event_dow -- dow aggregation
order by event_dow
Note that I attempted to optimize the where
clause of the query by filtering directly against original values (we can convert the bounds to unix timestamps instead) - and using half-open intervals.
答案2
得分: 1
我没有访问GBQ的权限,所以你可能需要调整日期函数以适应BQ使用的函数,但以下内容应该可以显示如何编写自己的查询:
with daily_count as (
select trunc(event_timestamp, 'day') event_day
, count(1) count_day
from agg_data
group by event_day
)
select DAYNAME(event_day), avg(count_day)
from daily_count
group by DAYNAME(event_day)
;
英文:
I don't have access to GBQ so you'll probably need to adjust the date functions to those used by BQ, but the following should show you how to write your own query:
with daily_count as (
select trunc(event_timestamp, 'day') event_day
, count(1) count_day
from agg_data
group by event_day
)
select DAYNAME(event_day), avg(count_day)
from daily_count
group by DAYNAME(event_day)
;
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论