2023年5月23日 00:26:47go评论108阅读模式

英文:

Calculating Overall Success Rate in SQL: Recursive CTE or Alternative Approach?

问题

以下是翻译的部分：

作为输入，我有3个字段：

Column_A：表示调查中的步骤按照时间顺序排列
Column_B：表示参与调查步骤的人数
Column_C：表示在特定调查步骤中成功的人数（B ≥ C）

现在我需要计算一个新的Column_D：这一列需要表示在所有先前步骤之后保持成功的参与者的总人数。因此，在每一步骤中，这个数字最好只能保持不变或者在不成功的参与者的情况下减少。

在Excel中，这相当容易，因为我会计算（除了第1行以外的行）：
前一行的Value_D -（Value_B - Value_C）。

我知道SQL中有lag和lead函数，但如何考虑同一列的前一行的计算呢？它是否能够工作？我需要编写递归的CTE之类的东西吗？

我认为一个大问题是：行数（Column_A）不相同，所以我不能简单地将多个CTE连接在一起。

Column_A	Column_B	Column_C	Column_D
0	35	35	35
1	35	35	35
2	35	34	34
3	34	33	33
4	33	30	30
5	33	31	28
6	33	33	28

我尝试了以下方法，但显然它返回了错误的结果，因为它没有考虑以前的计算。

SELECT [Column_A]
      ,[Column_B]
      ,[Column_C]
      , CASE WHEN [Column_A] = 0 
	         THEN [Column_B]
	         ELSE LAG([Column_C], 1, 0) OVER (PARTITION BY 1 
                                              ORDER     BY [Column_A] ASC) - ([Column_B] - [Column_C])
        END AS [Column_D]
FROM dataset

英文:

As input I have 3 fields:

Column_A: represents the step of the survey in a chronological order
Column_B: represents the number of persons, which participated in they survey step
Column_C: represents the number of persons, which were sucessfull in this specific survey step (B >= C)

Now I need to calculate a new Column_D: This column needs to represent the overall number of participants, which were remain successfull after all previous steps. So at best this number can only remain the same or being reduced after each step in case of an unsuccessfull participant.

In Excel this is kinda easy, because there i would calculate (except for row 1):
Value_D of former Row - (Value_B - Value_C).

I know of lag and lead in SQL, but how can I take into account the calculation of a former row of the same column? Does it work somehow? Do I need to code a recursive CTE or something?

I think big problem is this: The number of rows (Column_A) isnt the same, so i cant just glue several CTEs together.

Column_A	Column_B	Column_C	Column_D
0	35	35	35
1	35	35	35
2	35	34	34
3	34	33	33
4	33	30	30
5	33	31	28
6	33	33	28

I tried the following, but obviously it gives back the wrong results, because it doesnt take the former calculations in consideration.

SELECT [Column_A]
      ,[Column_B]
      ,[Column_C]
      , CASE WHEN [Column_A] = 0 
	         THEN [Column_B]
	         ELSE LAG([Column_C], 1, 0) OVER (PARTITION BY 1 
                                              ORDER     BY [Column_A] ASC) - ([Column_B] - [Column_C])
        END AS [Column_D]
FROM dataset

答案1

得分: 2

以下是翻译好的内容：

你可以结合两个窗口函数：

FIRST_VALUE，收集第一个 "Column_D" 的值
SUM，收集 "Column_B" 和 "Column_C" 之间的运行差值

然后从第一个窗口函数中减去第二个窗口函数。

SELECT *, FIRST_VALUE(Column_D) OVER(ORDER BY Column_A ROWS UNBOUNDED PRECEDING) - 
          SUM(Column_B - Column_C) OVER(ORDER BY Column_A ROWS UNBOUNDED PRECEDING) 
FROM tab

输出：

Column_A	Column_B	Column_C	Column_D	(无列名)
0	35	35	35	35
1	35	35	35	35
2	35	34	34	34
3	34	33	33	33
4	33	30	30	30
5	33	31	28	28
6	33	33	28	28

在此查看演示链接。

英文:

You can combine two window functions:

FIRST_VALUE, gathering the first "Column_D" value
SUM, gathering the running difference between "Column_B" and "Column_C"

then you subtract the second window function from the first.

SELECT *, FIRST_VALUE(Column_D) OVER(ORDER BY Column_A ROWS UNBOUNDED PRECEDING) - 
          SUM(Column_B - Column_C) OVER(ORDER BY Column_A ROWS UNBOUNDED PRECEDING) 
FROM tab

Output:

Column_A	Column_B	Column_C	Column_D	(No column name)
0	35	35	35	35
1	35	35	35	35
2	35	34	34	34
3	34	33	33	33
4	33	30	30	30
5	33	31	28	28
6	33	33	28	28

Check the demo here.

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

计算SQL中的总体成功率：递归CTE还是替代方法？

问题

答案1

唯一约束组相似的值

将一个列分成两列，带有多个符号

获取具有相同 – 最新 – 时间戳的行

Mybatis批量插入Java中的BigDecimal到SQL Server的decimal(18,2)时无法正确工作

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。