MySQL join and group based on date ranges

Question

I have table A

uid    dt         val_A
10     04/09/2012   34
10     08/09/2012   35
10     10/09/2012   36
100    04/09/2012   40
100    08/09/2012   41

and table B

uid   date        val_B
10    04/09/2012    1
10    05/09/2012    1
10    06/09/2012    2
10    07/09/2012    2
10    08/09/2012    1
100   07/09/2012    1
100   07/09/2012    3

I want to join them on uid to get table C. In addition, I want a new column val_C that holds the average of val_B over the rows where date in B is greater than or equal to the row's dt in A and less than the next higher dt for the same uid in A. In other words, I want to aggregate the values in B by the date ranges defined in A. The joined table should look like this:

uid    dt         val_A    val_C
10     04/09/2012   34     1.5
10     08/09/2012   35     1
10     10/09/2012   36     0
100    04/09/2012   40     2
100    08/09/2012   41     0

How can this be achieved?

//EDIT
What would a more general solution look like, in which all dates in B2 that are greater than the greatest date in A are joined and aggregated into the greatest date in A? B2:

uid   date        val_B
10    04/09/2012    1
10    05/09/2012    1
10    06/09/2012    2
10    07/09/2012    2
10    08/09/2012    1
100   07/09/2012    1
100   07/09/2012    3
100   10/09/2012    4
100   11/09/2012    2

Desired output C2:

uid    dt         val_A    val_C
10     04/09/2012   34     1.5
10     08/09/2012   35     1
10     10/09/2012   36     0
100    04/09/2012   40     2
100    08/09/2012   41     3

Answer 1

Score: 2

If you're on MySQL 8.0 or later, which supports the LEAD() window function, you can try this:

WITH cte AS (
  SELECT uid, dt, val_A,
         IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
    FROM tableA
)
SELECT cte.uid, cte.dt, cte.val_A,
       AVG(val_B) AS val_C
  FROM cte
  LEFT JOIN tableB tb1
    ON cte.uid = tb1.uid
   AND tb1.dt >= cte.dt
   AND tb1.dt < cte.dtRg
 GROUP BY cte.uid, cte.dt, cte.val_A;

The query inside the common table expression (cte):

SELECT uid, dt, val_A,
       IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
  FROM tableA

will give you a result like this:

[Image: result of the CTE query, with the dtRg column added]

As you can see, the dtRg column is generated using the LEAD() function, which takes the dt value of the next row within the same uid according to the ORDER BY. For the last row of each uid there is no next row, so LEAD() returns NULL and IFNULL() substitutes the row's own dt, which leaves that last range empty. Read more about LEAD() here.

After that, join the cte with tableB on matching uid, where tableB.dt is greater than or equal to the current tableA.dt (now cte.dt) but less than cte.dtRg, the next date in tableA as produced by LEAD(). Finally, add AVG(val_B) AS val_C.
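One detail worth noting: AVG() returns NULL when no tableB row falls inside a range, whereas the desired output in the question shows 0 for those rows. A minimal sketch of one way to get 0, assuming the same tableA and tableB as in the query above (DATE columns, with tableB's date column named dt as in this answer), is to wrap the aggregate in COALESCE():

```sql
WITH cte AS (
  SELECT uid, dt, val_A,
         IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY dt), dt) AS dtRg
    FROM tableA
)
SELECT cte.uid, cte.dt, cte.val_A,
       COALESCE(AVG(tb1.val_B), 0) AS val_C  -- 0 instead of NULL for empty ranges
  FROM cte
  LEFT JOIN tableB tb1
    ON cte.uid = tb1.uid
   AND tb1.dt >= cte.dt
   AND tb1.dt <  cte.dtRg
 GROUP BY cte.uid, cte.dt, cte.val_A
 ORDER BY cte.uid, cte.dt;
```

Whether 0 or NULL is the better marker for "no matching rows" depends on how the result is consumed; 0 matches the table in the question, but it cannot be told apart from a real average of 0.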

Demo fiddle
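To try the queries locally, a minimal setup sketch follows. It is an assumption-laden reconstruction of the question's sample data: the columns are declared as DATE and INT, the dates are read as dd/mm/yyyy, and tableB's date column is named dt to match the queries in this answer (the question calls it date).

```sql
-- Hypothetical schema; column types are assumed, not given in the question.
CREATE TABLE tableA (uid INT, dt DATE, val_A INT);
CREATE TABLE tableB (uid INT, dt DATE, val_B INT);

INSERT INTO tableA (uid, dt, val_A) VALUES
  (10,  '2012-09-04', 34),
  (10,  '2012-09-08', 35),
  (10,  '2012-09-10', 36),
  (100, '2012-09-04', 40),
  (100, '2012-09-08', 41);

INSERT INTO tableB (uid, dt, val_B) VALUES
  (10,  '2012-09-04', 1),
  (10,  '2012-09-05', 1),
  (10,  '2012-09-06', 2),
  (10,  '2012-09-07', 2),
  (10,  '2012-09-08', 1),
  (100, '2012-09-07', 1),
  (100, '2012-09-07', 3);
```

If the real columns store dates as dd/mm/yyyy strings rather than DATE values, the range comparisons would compare text and need a conversion such as STR_TO_DATE() first.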

On older MySQL versions, you can try this:

SELECT tA.uid, tA.dt, tA.val_A,
       AVG(val_B) AS val_C
  FROM (
        SELECT uid, dt, val_A,
               -- next later dt for the same uid; NULL on the last row.
               -- ORDER BY makes LIMIT 1 pick the nearest later date.
               (SELECT ta1.dt
                  FROM tableA ta1
                 WHERE ta1.uid = ta2.uid
                   AND ta1.dt > ta2.dt
                 ORDER BY ta1.dt
                 LIMIT 1) AS dtRg
          FROM tableA ta2
       ) tA
  LEFT JOIN tableB tB
    ON tA.uid = tB.uid
   AND tB.dt >= tA.dt
   AND tB.dt < tA.dtRg
 GROUP BY tA.uid, tA.dt, tA.val_A;

The differences are as follows:

  1. Instead of LEAD(), it uses a correlated subquery in the SELECT list to get the dt value of the next row for the same uid.
  2. Instead of a common table expression, it uses a derived table.

Fiddle for MySQL v5.7 version
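For the //EDIT part of the question, where rows in B2 later than the greatest dt in A should fall into that last range, one possible sketch is to leave dtRg as NULL on the last row of each uid and treat NULL as an open upper bound. It assumes MySQL 8.0 or later, DATE columns, and a hypothetical table named tableB2 that holds the B2 data with its date column named dt:

```sql
WITH cte AS (
  SELECT uid, dt, val_A,
         -- no IFNULL here: dtRg stays NULL on the last row of each uid
         LEAD(dt) OVER (PARTITION BY uid ORDER BY dt) AS dtRg
    FROM tableA
)
SELECT cte.uid, cte.dt, cte.val_A,
       COALESCE(AVG(tb2.val_B), 0) AS val_C
  FROM cte
  LEFT JOIN tableB2 tb2
    ON cte.uid = tb2.uid
   AND tb2.dt >= cte.dt
   AND (cte.dtRg IS NULL OR tb2.dt < cte.dtRg)  -- last range is open-ended
 GROUP BY cte.uid, cte.dt, cte.val_A
 ORDER BY cte.uid, cte.dt;
```

With the B2 sample data this should reproduce C2, up to decimal formatting: for uid 100, the rows dated 10/09 and 11/09 (values 4 and 2) land in the 08/09 range and average to 3. The same `dtRg IS NULL OR ...` condition can be applied to the derived-table query for older versions, since its subquery already yields NULL on the last row.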

huangapple
  • Posted on February 16, 2023, 15:25:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75469004.html