2023-02-16 15:25:09

MySQL join and group based on date ranges

Question

I have table A

uid    dt         val_A
10     04/09/2012   34
10     08/09/2012   35
10     10/09/2012   36
100    04/09/2012   40
100    08/09/2012   41

and table B

uid   date        val_B
10    04/09/2012    1
10    05/09/2012    1
10    06/09/2012    2
10    07/09/2012    2
10    08/09/2012    1
100   07/09/2012    1
100   07/09/2012    3

I want to join them on uid to get table C, with a new column val_C that holds the average of val_B over the rows where date in B is greater than or equal to the corresponding dt in A AND less than the next higher dt for that uid in A. In other words, I want to aggregate the values in B over the date ranges defined by A. The joined table should look like this:

uid    dt         val_A    val_C
10     04/09/2012   34     1.5
10     08/09/2012   35     1
10     10/09/2012   36     0
100    04/09/2012   40     2
100    08/09/2012   41     0

How can this be achieved?

//EDIT
What would a more general solution look like, where all dates in B2 that are greater than the greatest date in A are joined and aggregated to the greatest date in A? B2:

uid   date        val_B
10    04/09/2012    1
10    05/09/2012    1
10    06/09/2012    2
10    07/09/2012    2
10    08/09/2012    1
100   07/09/2012    1
100   07/09/2012    3
100   10/09/2012    4
100   11/09/2012    2

Desired output C2:

uid    dt         val_A    val_C
10     04/09/2012   34     1.5
10     08/09/2012   35     1
10     10/09/2012   36     0
100    04/09/2012   40     2
100    08/09/2012   41     3

Answer 1

Score: 2

If you're on MySQL 8.0 or later, which supports the LEAD() window function, you can try this:

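-- Assumes tableB's date column is named dt (the question's table calls it date).
WITH cte AS (
  -- dtRg = next dt for the same uid; the last row per uid falls back to its own dt
  SELECT uid, dt, val_A,
         IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
    FROM tableA
)
SELECT cte.uid, cte.dt, cte.val_A,
       AVG(tb1.val_B) AS val_C  -- NULL (not 0) when no tableB row falls in the range
  FROM cte
  LEFT JOIN tableB tb1
    ON cte.uid = tb1.uid
   AND tb1.dt >= cte.dt
   AND tb1.dt <  cte.dtRg
 GROUP BY cte.uid, cte.dt, cte.val_A;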

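The query inside the common table expression (cte):

  SELECT uid, dt, val_A,
         IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
    FROM tableA

gives a result like this for the sample tableA above:

uid    dt           val_A    dtRg
10     04/09/2012   34       08/09/2012
10     08/09/2012   35       10/09/2012
10     10/09/2012   36       10/09/2012
100    04/09/2012   40       08/09/2012
100    08/09/2012   41       08/09/2012

As you can see, the dtRg column is generated by the LEAD() function, which takes the dt value of the next row according to the ORDER BY within each uid partition; IFNULL() falls back to the row's own dt when there is no next row. Read more about LEAD() here.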

After that, join the cte with tableB on matching uid, where tableB.dt is greater than or equal to cte.dt (the original tableA.dt) but less than cte.dtRg (the next date in tableA, produced by LEAD()). Finally, add AVG(val_B) AS val_C.
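To check the two steps above against the question's sample data, here is a minimal sketch that runs them in an in-memory SQLite database from Python. SQLite stands in for MySQL 8 here only because it ships with Python and also supports CTEs, LEAD() and IFNULL() (SQLite 3.25 or later); the ISO date strings (reading the question's dates as dd/mm/yyyy), the helper name `run`, and the column name dt in tableB are assumptions. The `COALESCE(..., 0)` wrapper is an addition, not part of the answer: it turns the NULL of an empty range into the 0 shown in the desired table C.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (assumption: the question's dates are dd/mm/yyyy).
TABLE_A = [
    (10, "2012-09-04", 34), (10, "2012-09-08", 35), (10, "2012-09-10", 36),
    (100, "2012-09-04", 40), (100, "2012-09-08", 41),
]
TABLE_B = [
    (10, "2012-09-04", 1), (10, "2012-09-05", 1), (10, "2012-09-06", 2),
    (10, "2012-09-07", 2), (10, "2012-09-08", 1),
    (100, "2012-09-07", 1), (100, "2012-09-07", 3),
]

# The CTE on its own: each tableA row plus the upper bound of its date range.
CTE_SQL = """
SELECT uid, dt, val_A,
       IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
  FROM tableA
 ORDER BY uid, dt
"""

# The full query. COALESCE is an addition: empty ranges become 0 instead of NULL.
FULL_SQL = """
WITH cte AS (
  SELECT uid, dt, val_A,
         IFNULL(LEAD(dt) OVER (PARTITION BY uid ORDER BY uid, dt), dt) AS dtRg
    FROM tableA
)
SELECT cte.uid, cte.dt, cte.val_A,
       COALESCE(AVG(tb1.val_B), 0) AS val_C
  FROM cte
  LEFT JOIN tableB tb1
    ON cte.uid = tb1.uid
   AND tb1.dt >= cte.dt
   AND tb1.dt <  cte.dtRg
 GROUP BY cte.uid, cte.dt, cte.val_A
 ORDER BY cte.uid, cte.dt
"""


def run(sql, rows_a=TABLE_A, rows_b=TABLE_B):
    """Load the sample rows into an in-memory SQLite database and run sql."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE tableA (uid INTEGER, dt TEXT, val_A INTEGER)")
        con.execute("CREATE TABLE tableB (uid INTEGER, dt TEXT, val_B INTEGER)")
        con.executemany("INSERT INTO tableA VALUES (?, ?, ?)", rows_a)
        con.executemany("INSERT INTO tableB VALUES (?, ?, ?)", rows_b)
        return con.execute(sql).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in run(CTE_SQL):
        print(row)
    for row in run(FULL_SQL):
        print(row)
```

With the sample data, the second query returns val_C values of 1.5, 1.0, 0, 2.0 and 0, matching table C. The last row per uid always gets 0 in this form, because its range runs from dt to dt and is therefore empty.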

Demo fiddle

On older MySQL versions, you can try this:

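SELECT tA.uid, tA.dt, tA.val_A,
       AVG(tB.val_B) AS val_C
  FROM (SELECT uid, dt, val_A,
               -- next dt for the same uid; ORDER BY makes LIMIT 1 pick the nearest one
               (SELECT ta1.dt FROM tableA ta1
                 WHERE ta1.uid = ta2.uid
                   AND ta1.dt > ta2.dt
                 ORDER BY ta1.dt
                 LIMIT 1) AS dtRg
          FROM tableA ta2) tA
  LEFT JOIN tableB tB
    ON tA.uid = tB.uid
   AND tB.dt >= tA.dt
   AND tB.dt <  tA.dtRg  -- dtRg is NULL for the last dt per uid, so that row gets val_C = NULL
 GROUP BY tA.uid, tA.dt, tA.val_A;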

The differences are as follows:

Instead of LEAD(), it uses a correlated subquery in the SELECT list to get the dt value of the next row for the same uid.
Instead of a common table expression, it uses a derived table.

Fiddle for MySQL v5.7 version
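Neither query above covers the question's EDIT, where B2 rows dated after the greatest dt in A should be aggregated into that last row. One possible sketch, under the same assumptions as before (SQLite from Python as a stand-in for MySQL, ISO date strings, tableB's date column named dt, and a hypothetical helper `range_averages`), treats a NULL upper bound as an open-ended range. It uses MIN() in the correlated subquery, which is equivalent to ORDER BY ... LIMIT 1 here, and again adds COALESCE to report 0 for empty ranges.

```python
import sqlite3

# tableA and B2 from the question's EDIT, dates as ISO strings
# (assumption: the question's dates are dd/mm/yyyy).
TABLE_A = [
    (10, "2012-09-04", 34), (10, "2012-09-08", 35), (10, "2012-09-10", 36),
    (100, "2012-09-04", 40), (100, "2012-09-08", 41),
]
TABLE_B2 = [
    (10, "2012-09-04", 1), (10, "2012-09-05", 1), (10, "2012-09-06", 2),
    (10, "2012-09-07", 2), (10, "2012-09-08", 1),
    (100, "2012-09-07", 1), (100, "2012-09-07", 3),
    (100, "2012-09-10", 4), (100, "2012-09-11", 2),
]

# dtRg is the next dt for the same uid, or NULL for the last one.
# A NULL dtRg means "no upper bound", so later B2 rows land on the last A row.
OPEN_ENDED_SQL = """
SELECT tA.uid, tA.dt, tA.val_A,
       COALESCE(AVG(tB.val_B), 0) AS val_C
  FROM (SELECT uid, dt, val_A,
               (SELECT MIN(ta1.dt) FROM tableA ta1
                 WHERE ta1.uid = ta2.uid
                   AND ta1.dt > ta2.dt) AS dtRg
          FROM tableA ta2) tA
  LEFT JOIN tableB tB
    ON tA.uid = tB.uid
   AND tB.dt >= tA.dt
   AND (tA.dtRg IS NULL OR tB.dt < tA.dtRg)
 GROUP BY tA.uid, tA.dt, tA.val_A
 ORDER BY tA.uid, tA.dt
"""


def range_averages(rows_a=TABLE_A, rows_b=TABLE_B2):
    """Run the open-ended range query on an in-memory SQLite database."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE tableA (uid INTEGER, dt TEXT, val_A INTEGER)")
        con.execute("CREATE TABLE tableB (uid INTEGER, dt TEXT, val_B INTEGER)")
        con.executemany("INSERT INTO tableA VALUES (?, ?, ?)", rows_a)
        con.executemany("INSERT INTO tableB VALUES (?, ?, ?)", rows_b)
        return con.execute(OPEN_ENDED_SQL).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in range_averages():
        print(row)
```

On B2 this gives val_C = 3.0 for uid 100 at 08/09/2012 (the average of 4 and 2), in line with the desired output C2. The query avoids window functions and CTEs, so it should also suit MySQL versions before 8.0, though that has not been verified here.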
