英文:
Average by month cumulatively, 2 variables
问题
我有两列数据:洗涤分数(wash_sc
)和农业分数(agric_sc
)。我想要累积求和并按月份分组获取它们的平均值。
目前,我能够为一个变量完成这个任务,尽管返回响应需要很长时间:
select date_format(s1.createdAt, '%b-%Y') as month,
(
select AVG(s2.wash_sc)
from checkins s2
where s2.createdAt <= last_day(s1.createdAt)
) as total
from checkins s1
group by month
order by s1.createdAt;
如何以更快的速度同时处理这两列数据?
英文:
I have two columns: wash score (wash_sc
) and agric score (agric_sc
). I want to get the average of both summing cumulatively and group by month.
Currently am able to accomplish this for one variable, though it is takes a long time to return the response:
select date_format(s1.createdAt,'%b-%Y') as month,
(
select AVG(s2.wash_sc)
from checkins s2
where s2.createdAt <= last_day(s1.createdAt)
) as total
from checkins s1
group by month
order by s1.createdAt;
How can I do this (faster!) for both columns?
答案1
得分: 2
根据您的确认,您想要实现一个运行平均值,根据您自己的查询,您需要基于运行总和除以运行计数来计算:
select year(createdAt) as `year`, month(createdAt) as `month`,
sum(sum(wash_sc)) over w / sum(count(wash_sc)) over w as cumul_avg_wash_sc,
sum(sum(agric_sc)) over w / sum(count(agric_sc)) over w as cumul_avg_agric_sc
from checkins
group by year(createdAt), month(createdAt)
window w as (order by year(createdAt), month(createdAt));
sum(sum(wash_sc)) over w
使用 SUM()
作为 窗口函数 来生成每个组内 SUM(wash_sc)
的累积总和。类似地,用于累积计数。
给定以下数据:
createdAt | wash_sc | agric_sc |
---|---|---|
2021-01-01 00:00:00 | 11 | 2 |
2022-01-02 00:00:00 | 9 | 4 |
2022-01-03 00:00:00 | 7 | 6 |
2022-01-04 00:00:00 | 5 | 8 |
2022-01-05 00:00:00 | 3 | 10 |
2022-01-06 00:00:00 | 1 | 12 |
它输出:
year | month | cumul_wash_sc | cumul_agric_sc |
---|---|---|---|
2021 | 1 | 11.0000 | 2.0000 |
2022 | 1 | 6.0000 | 7.0000 |
这里有一个 db<>fiddle 供您参考。
英文:
Based on your confirmation that you want to achieve a running average, as per your own query, you need to calculate based on the running sum divided by the running count:
select year(createdAt) as `year`, month(createdAt) as `month`,
sum(sum(wash_sc)) over w / sum(count(wash_sc)) over w as cumul_avg_wash_sc,
sum(sum(agric_sc)) over w / sum(count(agric_sc)) over w as cumul_avg_agric_sc
from checkins
group by year(createdAt), month(createdAt)
window w as (order by year(createdAt), month(createdAt));
The sum(sum(wash_sc)) over w
uses SUM()
as a window function to produce the cumulative sum of the inner SUM(wash_sc)
per group. Similarly for cumulative count.
Given the following data:
createdAt | wash_sc | agric_sc |
---|---|---|
2021-01-01 00:00:00 | 11 | 2 |
2022-01-02 00:00:00 | 9 | 4 |
2022-01-03 00:00:00 | 7 | 6 |
2022-01-04 00:00:00 | 5 | 8 |
2022-01-05 00:00:00 | 3 | 10 |
2022-01-06 00:00:00 | 1 | 12 |
It outputs:
year | month | cumul_wash_sc | cumul_agric_sc | ||
---|---|---|---|---|---|
2021 | 1 | 11.0000 | 2.0000 | ||
2022 | 1 | 6.0000 | 7.0000 |
Here is a db<>fiddle to play with.
答案2
得分: 1
SELECT
DATE_FORMAT(s1.createdAt, '%b-%Y') AS 月份,
(SELECT SUM(s2.wash_score) FROM checkins s2 WHERE MONTH(s2.createdAt) <= MONTH(s1.createdAt)) AS 累积洗涤分数,
(SELECT SUM(s3.agric_score) FROM checkins s3 WHERE MONTH(s3.createdAt) <= MONTH(s1.createdAt)) AS 累积农业分数
FROM
checkins s1
GROUP BY
MONTH(s1.createdAt)
ORDER BY
s1.createdAt;
英文:
SELECT
DATE_FORMAT(s1.createdAt, '%b-%Y') AS month,
(SELECT SUM(s2.wash_score) FROM checkins s2 WHERE MONTH(s2.createdAt) <= MONTH(s1.createdAt)) AS cumulative_wash_score,
(SELECT SUM(s3.agric_score) FROM checkins s3 WHERE MONTH(s3.createdAt) <= MONTH(s1.createdAt)) AS cumulative_agric_score
FROM
checkins s1
GROUP BY
MONTH(s1.createdAt)
ORDER BY
s1.createdAt;
答案3
得分: 1
使用窗口函数 AVG()
可以实现这个操作:
with cte as (
select *, AVG(wash_sc) over (partition by YEAR(createdAt), MONTH(createdAt) order by createdAt) as cumul_wash_sc,
AVG(agric_sc) over (partition by YEAR(createdAt), MONTH(createdAt) order by createdAt) as cumul_agric_sc
from checkins
)
select date_format(createdAt,'%b-%Y') as month, max(cumul_wash_sc) as cumulative_wash_score, max(cumul_agric_sc) as cumulative_agric_score
from cte
group by date_format(createdAt,'%b-%Y')
英文:
You can do it using the window function AVG()
:
with cte as (
select *, AVG(wash_sc) over (partition by YEAR(createdAt), MONTH(createdAt) order by createdAt) as cumul_wash_sc,
AVG(agric_sc) over (partition by YEAR(createdAt), MONTH(createdAt) order by createdAt) as cumul_agric_sc
from checkins
)
select date_format(createdAt,'%b-%Y') as month, max(cumul_wash_sc) as cumulative_wash_score, max(cumul_agric_sc) as cumulative_agric_score
from cte
group by date_format(createdAt,'%b-%Y')
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论