Skip result row if value is missing in group

Question

I have a Postgres table like this:

datetime          tenant_id  orders_today
2023-06-25 10:00  tenant2    2
2023-06-25 10:00  tenant1    1
2023-06-25 11:00  tenant1    5
2023-06-25 11:00  tenant2    2
2023-06-25 12:00  tenant1    5

Note that an orders_today value for tenant2 hasn't been generated yet for 12:00.
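
To reproduce the sample data, the table could be set up roughly as follows. This is only a sketch: the question does not show the table definition, so the column types and the primary key are assumptions.

```sql
-- Assumed schema: the column types and the primary key are guesses,
-- the question does not show the actual table definition.
CREATE TABLE orders (
   datetime     timestamp NOT NULL
 , tenant_id    text      NOT NULL
 , orders_today integer   NOT NULL
 , PRIMARY KEY (datetime, tenant_id)
);

INSERT INTO orders (datetime, tenant_id, orders_today) VALUES
   ('2023-06-25 10:00', 'tenant2', 2)
 , ('2023-06-25 10:00', 'tenant1', 1)
 , ('2023-06-25 11:00', 'tenant1', 5)
 , ('2023-06-25 11:00', 'tenant2', 2)
 , ('2023-06-25 12:00', 'tenant1', 5)   -- no row for tenant2 at 12:00
;
```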

I use a query like this to summarize orders_today:

SELECT datetime, SUM(orders_today)
FROM orders
GROUP BY datetime

But this gives me this result:

datetime          sum
2023-06-25 10:00  3
2023-06-25 11:00  7
2023-06-25 12:00  5

How can I make it ignore the group for 12:00, where the count for tenant2 is missing? And, if possible, can I make it use the previous value for tenant2 from 11:00 instead?

Answer 1

Score: 2

> can I make it use the previous value for tenant 2 from time 11?

Assuming:

  • (datetime, tenant_id) is the PRIMARY KEY, so the combination is UNIQUE and both columns are NOT NULL.
  • You want results for a given time range.
  • You want results for a given set of tenants ('tenant1' and 'tenant2' in the example).
  • One row for every full hour in the time range is expected.
  • Substitute for missing rows with the value for the same tenant from the previous hour.
  • If we can't get orders for each tenant this way, omit the row.

Generate all combinations of interest from tenant & timestamp, LEFT JOIN to your table, LEFT JOIN to a possible substitute, then proceed:

SELECT datetime, sum(orders) AS sum_orders_today
FROM  (
   SELECT d.datetime, COALESCE(o.orders_today, o1.orders_today) AS orders
   FROM   generate_series(timestamp '2023-06-25 10:00'
                        , timestamp '2023-06-25 13:00'
                        , interval '1 hour') AS d(datetime)    -- given time range
   CROSS  JOIN (VALUES ('tenant1'), ('tenant2')) t(tenant_id)  -- given tenants
   LEFT   JOIN orders o USING (tenant_id, datetime)            -- regular data
   LEFT   JOIN orders o1 ON o1.tenant_id = t.tenant_id         -- substitutes
                        AND o1.datetime  = d.datetime - interval '1 hour'
   ) sub
GROUP  BY 1
HAVING count(orders) = 2  -- omit hours without full data set
ORDER  BY 1;

If my assumptions don't hold, you have to be more specific.
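
A possible variation on the query above, in case the substitute should be the most recent earlier value from the same day rather than strictly the value from one hour before. This is a sketch, not part of the original answer: the alias prev and the same-day lower bound are assumptions (the bound is there because orders_today presumably resets every day).

```sql
SELECT d.datetime, sum(prev.orders_today) AS sum_orders_today
FROM   generate_series(timestamp '2023-06-25 10:00'
                     , timestamp '2023-06-25 13:00'
                     , interval '1 hour') AS d(datetime)    -- given time range
CROSS  JOIN (VALUES ('tenant1'), ('tenant2')) t(tenant_id)  -- given tenants
LEFT   JOIN LATERAL (
   SELECT o.orders_today
   FROM   orders o
   WHERE  o.tenant_id = t.tenant_id
   AND    o.datetime <= d.datetime                      -- current or earlier row
   AND    o.datetime >= date_trunc('day', d.datetime)   -- assumed: same day only
   ORDER  BY o.datetime DESC
   LIMIT  1                                             -- latest available value
   ) prev ON true
GROUP  BY 1
HAVING count(prev.orders_today) = 2  -- omit hours before a tenant's first row
ORDER  BY 1;
```

With the sample data this returns 3 for 10:00 and 7 for 11:00, 12:00 and 13:00, because the value 2 for tenant2 from 11:00 is carried forward until a newer row exists.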

Answer 2

Score: 0

If I understand correctly, you want to exclude a datetime group when a tenant is missing from it. If so, you can use a HAVING clause with the condition that the number of tenants for each datetime must equal the total number of tenants:

select datetime, sum(orders_today)
from mytable
group by datetime
having count(distinct tenant_id) = (select count(distinct tenant_id) from mytable)
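
Applied to the table name from the question (orders instead of mytable), the same idea might look like this. The ORDER BY and the column alias are additions for readability, not part of the original answer.

```sql
SELECT datetime, sum(orders_today) AS sum_orders_today
FROM   orders
GROUP  BY datetime
-- keep only hours in which every tenant known to the table has a row
HAVING count(DISTINCT tenant_id) = (SELECT count(DISTINCT tenant_id) FROM orders)
ORDER  BY datetime;
```

With the sample data this keeps 10:00 (sum 3) and 11:00 (sum 7) and drops 12:00. Note that it only skips incomplete hours and does not substitute a previous value, and that the required tenant count comes from every tenant that appears anywhere in the table.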

huangapple
  • Published on June 26, 2023, 03:46:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76552163.html