Skip result row if value is missing in group
Question
I have a Postgres table like this:
datetime | tenant_id | orders_today |
---|---|---|
2023-06-25 10:00 | tenant2 | 2 |
2023-06-25 10:00 | tenant1 | 1 |
2023-06-25 11:00 | tenant1 | 5 |
2023-06-25 11:00 | tenant2 | 2 |
2023-06-25 12:00 | tenant1 | 5 |
Note that no orders_today value has been generated yet for tenant2 at 12:00.
I use a query like this to sum up orders_today:
SELECT datetime, SUM(orders_today)
FROM orders
GROUP BY datetime
But this gives me this result:
datetime | sum |
---|---|
2023-06-25 10:00 | 3 |
2023-06-25 11:00 | 7 |
2023-06-25 12:00 | 5 |
How can I make it ignore the group for 12:00, where the count for tenant 2 is missing? And, if possible, can I make it use the previous value for tenant 2 from time 11?
Answer 1
Score: 2
> can I make it use the previous value for tenant 2 from time 11?
Assuming:
- (datetime, tenant_id) is the PRIMARY KEY. So: UNIQUE and both columns NOT NULL.
- You want results for a given time range.
- You want results for a given set of tenants ('tenant1' and 'tenant2' in the example).
- One row for every full hour in the time range is expected.
- Substitute for missing rows with the value for the same tenant from the previous hour.
- If we can't get orders for each tenant this way, omit the row.
Generate all combinations of interest from tenant & timestamp, LEFT JOIN to your table, LEFT JOIN to a possible substitute, then proceed:
SELECT datetime, sum(orders) AS sum_orders_today
FROM  (
   SELECT d.datetime, COALESCE(o.orders_today, o1.orders_today) AS orders
   FROM   generate_series(timestamp '2023-06-25 10:00'
                        , timestamp '2023-06-25 13:00'
                        , interval '1 hour') AS d(datetime)      -- given time range
   CROSS  JOIN (VALUES ('tenant1'), ('tenant2')) t(tenant_id)   -- given tenants
   LEFT   JOIN orders o USING (tenant_id, datetime)             -- regular data
   LEFT   JOIN orders o1 ON o1.tenant_id = t.tenant_id          -- substitutes
                        AND o1.datetime = d.datetime - interval '1 hour'
   ) sub
GROUP  BY 1
HAVING count(orders) = 2  -- omit hours without full data set
ORDER  BY 1;
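To check the logic against the sample data without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in and several things are assumptions of this sketch: the column types, the timestamps stored as text, a recursive CTE (named d(ts) here) replacing generate_series, and SQLite's datetime() function replacing interval arithmetic. In Postgres the query above should run as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    datetime     TEXT    NOT NULL,   -- assumption: timestamps stored as text
    tenant_id    TEXT    NOT NULL,
    orders_today INTEGER NOT NULL,
    PRIMARY KEY (datetime, tenant_id)
);
INSERT INTO orders VALUES
    ('2023-06-25 10:00:00', 'tenant2', 2),
    ('2023-06-25 10:00:00', 'tenant1', 1),
    ('2023-06-25 11:00:00', 'tenant1', 5),
    ('2023-06-25 11:00:00', 'tenant2', 2),
    ('2023-06-25 12:00:00', 'tenant1', 5);
""")

query = """
WITH RECURSIVE d(ts) AS (            -- hourly grid, 10:00 through 13:00
    SELECT '2023-06-25 10:00:00'
    UNION ALL
    SELECT datetime(ts, '+1 hour') FROM d
    WHERE ts < '2023-06-25 13:00:00'
),
t(tenant_id) AS (VALUES ('tenant1'), ('tenant2'))   -- given tenants
SELECT d.ts,
       sum(COALESCE(o.orders_today, o1.orders_today)) AS sum_orders_today
FROM d
CROSS JOIN t
LEFT JOIN orders o                   -- regular data
       ON o.tenant_id = t.tenant_id AND o.datetime = d.ts
LEFT JOIN orders o1                  -- substitute from the previous hour
       ON o1.tenant_id = t.tenant_id
      AND o1.datetime = datetime(d.ts, '-1 hour')
GROUP BY d.ts
HAVING count(COALESCE(o.orders_today, o1.orders_today)) = 2  -- both tenants covered
ORDER BY d.ts
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
# ('2023-06-25 10:00:00', 3)
# ('2023-06-25 11:00:00', 7)
# ('2023-06-25 12:00:00', 7)   <- tenant2 filled in from 11:00
```

The 13:00 row is omitted because tenant2 has no value at 13:00 and none at 12:00 to substitute, so only one of the two tenants is covered.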
Similar:
- https://stackoverflow.com/questions/22567106/how-to-group-by-a-b-and-return-set-of-n-rows-of-b/22567596#22567596
- https://stackoverflow.com/questions/22443882/array-agg-group-by-and-null/22480876#22480876
If my assumptions don't hold, you have to be more specific.
Answer 2
Score: 0
If I understand correctly, you want to exclude a datetime group if a tenant is missing. If so, you can use a HAVING clause with the condition that the number of tenants for each datetime must equal the total number of tenants:
select datetime, sum(orders_today)
from mytable
group by datetime
having count(distinct tenant_id) = (select count(distinct tenant_id) from mytable)
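As a quick check of what this filter returns on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for Postgres, and the column types and the added ORDER BY (for stable output) are assumptions of this sketch; the query itself is otherwise unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (
    datetime     TEXT    NOT NULL,   -- assumption: timestamps stored as text
    tenant_id    TEXT    NOT NULL,
    orders_today INTEGER NOT NULL,
    PRIMARY KEY (datetime, tenant_id)
);
INSERT INTO mytable VALUES
    ('2023-06-25 10:00', 'tenant2', 2),
    ('2023-06-25 10:00', 'tenant1', 1),
    ('2023-06-25 11:00', 'tenant1', 5),
    ('2023-06-25 11:00', 'tenant2', 2),
    ('2023-06-25 12:00', 'tenant1', 5);
""")

# Keep only the hours in which every tenant known to the table has a row.
complete_hours = conn.execute("""
    SELECT datetime, sum(orders_today)
    FROM mytable
    GROUP BY datetime
    HAVING count(DISTINCT tenant_id) = (SELECT count(DISTINCT tenant_id) FROM mytable)
    ORDER BY datetime
""").fetchall()

for row in complete_hours:
    print(row)
# ('2023-06-25 10:00', 3)
# ('2023-06-25 11:00', 7)
```

The 12:00 group is dropped rather than filled in, so this covers the first part of the question only. The expected tenant count is also taken from the whole table, so a tenant that appears in any row is required in every hour.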