Oracle 4周滚动平均 – 缺失值必须被视为0。

huangapple go评论128阅读模式
英文:

Oracle 4 weeks running average - missing values must imply 0

问题

我试图创建一个数据的4周移动平均,并将其旋转以用于数据透视表报告。我基本上正在将一个电子表格作为数据库以及其报告迁移,我刚刚开始意识到Excel有多么松散和宽容。

目标是获取所有RC的最近4周平均值,即使它们在该周没有数据也需要显示0的滚动平均值。

  • 因此,如果FOO在过去4周没有数据,它仍需要显示0的滚动平均值。
  • 如果在过去4周的任何部分没有数据,那个值必须被视为0处理。

以下是一些经过处理的示例数据,其中包含3周的数据。我提供了2周的输出示例,以便那些想要尝试的人还有一个神秘的周来验证他们的查询。

CREATE TABLE MY_TABLE
(
    cy               int          NOT NULL,
    week             int          NOT NULL,
    RC               VARCHAR2(10) NOT NULL,
    duration_minutes number       Not NULL
);

INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'HCM', 86);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'ERP', 12);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'PBI', 16);
-- 省略部分数据
-- commit;

这个查询不够理想,因为我不得不列出所有的周。我需要一个可以跨所有数据执行此操作的查询。也许可以使用where CY_WEEK between MIN(CY_WEEK) and MAX(CY_WEEK)这样的子句,但这仍然不能处理丢失的数据。

select * from (
    select '2022_10' as CY_WEEK, RC, SUM(s) / 4 as rolling_avg
    from (
           select RC, SUM(DURATION_MINUTES) as s
           from MY_TABLE where CY = 2022 and WEEK between 07 and 10
           group by cy,week, RC order by  RC
    ) group by RC union all

    select '2022_11' as CY_WEEK, RC, SUM(s) / 4 as rolling_avg
    from (
           select RC, SUM(DURATION_MINUTES) as s
           from MY_TABLE where CY = 2022 and WEEK between 08 and 11
           group by cy,week, RC order by  RC
    ) group by RC

) order by CY_WEEK, RC;

当我完成后,非旋转的数据看起来像这样(理想情况下:每周应该有1行,包括所有RC的4周滚动平均值,即使它们没有数据)。我添加了comment字段只是为了阐述问题,尽管它不在查询中:

CY_WEEK RC ROLLING_AVG comment
2022_10 HCM 269.75
2022_10 PALPHABET 32
2022_10 PBI 245.25
2022_10 SCM 0 该周没有SCM数据
2022_10 PBU 52.25
2022_10 PDT 23
2022_10 PIO 187.5
2022_10 POTHER 142.25
2022_10 PWC 142
2022_10 TBD 85.75
2022_11 HCM 182.25
2022_11 PALPHABET 32
2022_11 PBI 198.5
2022_11 PBU 112.25
2022_11 PDT 23
2022_11 PIO 191.5
2022_11 POTHER 130
2022_11 PWC 142
2022_11 SCM 24.25
2022_11 TBD 84.75

理想格式:

CY_WEEK HCM PALPHABET PBI SCM PBU PDT PIO POTHER PWC TBD
2022_11 269.75 32 245.25 0 52.25 23 187.5 142.25 142 85.75
2022_11 182.25 32 198.5 24.25 112.25 23 191.5 130 142 85.75
英文:

I am trying to create a 4 week running average of data, and then pivot it for a pivot table report. I am basically porting a spreadsheet-as-a-DB and it's reports and I am just starting to realize how sloppy and forgiving excel is.

The goal is to take the running 4 week average, of all the RCs (even if they have no data for that week).

  • So, if FOO had no data for the last 4 weeks, it needs to still show a rolling average of 0
  • If there was no data for any part of the last 4 weeks, that value must be treated as 0 for that RC

Below is some sanitized, sample data which contains 3 weeks of data. I've supplied sample of output for 2 weeks, so those who wanna take a stab at this still have a mystery week to prove out their query.

CREATE TABLE MY_TABLE
(
    cy               int          NOT NULL,
    week             int          NOT NULL,
    RC               VARCHAR2(10) NOT NULL,
    duration_minutes number       Not NULL
);

INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'HCM', 86);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'ERP', 12);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'PBI', 16);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'PBI', 75);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'ERP', 24);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'ERP', 48);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 06, 'CRM', 25);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'HCM', 43);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'TBD', 4);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'POTHER', 9);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'POTHER', 14);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'POTHER', 26);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'HCM', 240);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'PBI', 187);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 07, 'HCM', 67);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'PBI', 491);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'PBI', 192);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'PBI', 57);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'PWC', 499);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'TBD', 105);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'HCM', 37);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'POTHER', 380);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'TBD', 46);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'POTHER', 140);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 08, 'HCM', 116);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'PWC', 69);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'PBI', 54);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'PALPHABET', 128);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'HCM', 172);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'HCM', 96);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 09, 'PDT', 92);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 350);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'TBD', 8);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'TBD', 180);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 13);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 25);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 37);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'HCM', 184);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'HCM', 59);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 24);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 59);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'HCM', 65);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PBU', 209);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 10, 'PIO', 242);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 11, 'SCM', 97);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 11, 'PBU', 240);
INSERT INTO MY_TABLE(CY, WEEK, RC, DURATION_MINUTES) VALUES (2022, 11, 'PIO', 16);

commit;

This query is not ideal because I have to spell out all of the weeks. I need a query that just does this across all data. Maybe a clause of where CY_WEEK between MIN(CY_WEEK) and MAX(CY_WEEK), but this would still not account for missing data.

select * from (
    select '2022_10' as CY_WEEK, RC, SUM(s) / 4 as rolling_avg
    from (
           select RC, SUM(DURATION_MINUTES) as s
           from MY_TABLE where CY = 2022 and WEEK between 07 and 10
           group by cy,week, RC order by  RC
    ) group by RC union all

    select '2022_11' as CY_WEEK, RC, SUM(s) / 4 as rolling_avg
    from (
           select RC, SUM(DURATION_MINUTES) as s
           from MY_TABLE where CY = 2022 and WEEK between 08 and 11
           group by cy,week, RC order by  RC
    ) group by RC

) order by CY_WEEK, RC;

The unpivoted data, when I am done, looks like this (Ideally: every week would have 1 line, with the 4 week rolling average, for all RCs, even if they had no data). I've added the comment field just to articulate the issue, thought it's not in the query:

CY_WEEK RC ROLLING_AVG comment
2022_10 HCM 269.75
2022_10 PALPHABET 32
2022_10 PBI 245.25
2022_10 SCM 0 no scm data for the week
2022_10 PBU 52.25
2022_10 PDT 23
2022_10 PIO 187.5
2022_10 POTHER 142.25
2022_10 PWC 142
2022_10 TBD 85.75
2022_11 HCM 182.25
2022_11 PALPHABET 32
2022_11 PBI 198.5
2022_11 PBU 112.25
2022_11 PDT 23
2022_11 PIO 191.5
2022_11 POTHER 130
2022_11 PWC 142
2022_11 SCM 24.25
2022_11 TBD 84.75

Ideal format:

CY_WEEK HCM PALPHABET PBI SCM PBU PDT PIO POTHER PWC TBD
2022_11 269.75 32 245.25 0 52.25 23 187.5 142.25 142 85.75
2022_11 182.25 32 198.5 24.25 112.25 23 191.5 130 142 85.75

There are a lot of RCs. It would be great to not have to list them all in the query

答案1

得分: 2

以下是您要的翻译:

您可以使用PARTITIONed OUTER JOIN来确保每个rc/cy/week组合都有一行,然后使用分析函数SUM来获取过去4周的累积总数,然后进行PIVOT操作:

WITH rcs (rc) AS (
SELECT 'HCM' FROM DUAL UNION ALL
SELECT 'PALPHABET' FROM DUAL UNION ALL
SELECT 'PBI' FROM DUAL UNION ALL
SELECT 'SCM' FROM DUAL UNION ALL
SELECT 'PBU' FROM DUAL UNION ALL
SELECT 'PDT' FROM DUAL UNION ALL
SELECT 'PIO' FROM DUAL UNION ALL
SELECT 'POTHER' FROM DUAL UNION ALL
SELECT 'PWC' FROM DUAL UNION ALL
SELECT 'TBD' FROM DUAL
)
SELECT *
FROM   (
SELECT cy,
week,
r.rc,
SUM(COALESCE(duration_minutes, 0)) OVER (
PARTITION BY r.rc
ORDER BY TO_DATE(cy||'-01-01', 'YYYY-MM-DD') + week * 7 - 1
RANGE BETWEEN INTERVAL '21' DAY PRECEDING AND CURRENT ROW
) / 4 AS avg_duration
FROM   rcs r
LEFT OUTER JOIN my_table m
PARTITION BY (m.cy, m.week)
ON (r.rc = m.rc)
)
PIVOT (
MAX(avg_duration)
FOR rc IN (
'HCM' AS hcm,
'PALPHABET' AS palphabet,
'PBI' AS pbi,
'SCM' AS scm,
'PBU' AS pbu,
'PDT' AS pdt,
'PIO' AS pio,
'POTHER' AS pother,
'PWC' AS pwc,
'TBD' AS tdb
)
)
ORDER BY cy, week;

对于示例数据,输出如下:

CY WEEK HCM PALPHABET PBI SCM PBU PDT PIO POTHER PWC TDB
2022 6 21.5 0 22.75 0 0 0 0 0 0 0
2022 7 109 0 69.5 0 0 0 0 12.25 0 1
2022 8 147.25 0 254.5 0 0 0 0 142.25 124.75 38.75
2022 9 214.25 32 268 0 0 23 0 142.25 142 38.75
2022 10 269.75 32 245.25 0 52.25 23 187.5 142.25 142 85.75
2022 11 182.25 32 198.5 24.25 112.25 23 191.5 130 142 84.75

fiddle

英文:

You can use a PARTITIONed OUTER JOIN to ensure that there is a row for each rc/cy/week combination and then use the analytic SUM function to get the rolling total for the past 4 weeks and then PIVOT:

WITH rcs (rc) AS (
SELECT 'HCM' FROM DUAL UNION ALL
SELECT 'PALPHABET' FROM DUAL UNION ALL
SELECT 'PBI' FROM DUAL UNION ALL
SELECT 'SCM' FROM DUAL UNION ALL
SELECT 'PBU' FROM DUAL UNION ALL
SELECT 'PDT' FROM DUAL UNION ALL
SELECT 'PIO' FROM DUAL UNION ALL
SELECT 'POTHER' FROM DUAL UNION ALL
SELECT 'PWC' FROM DUAL UNION ALL
SELECT 'TBD' FROM DUAL
)
SELECT *
FROM   (
SELECT cy,
week,
r.rc,
SUM(COALESCE(duration_minutes, 0)) OVER (
PARTITION BY r.rc
ORDER BY TO_DATE(cy||'-01-01', 'YYYY-MM-DD') + week * 7 - 1
RANGE BETWEEN INTERVAL '21' DAY PRECEDING AND CURRENT ROW
) / 4 AS avg_duration
FROM   rcs r
LEFT OUTER JOIN my_table m
PARTITION BY (m.cy, m.week)
ON (r.rc = m.rc)
)
PIVOT (
MAX(avg_duration)
FOR rc IN (
'HCM' AS hcm,
'PALPHABET' AS palphabet,
'PBI' AS pbi,
'SCM' AS scm,
'PBU' AS pbu,
'PDT' AS pdt,
'PIO' AS pio,
'POTHER' AS pother,
'PWC' AS pwc,
'TBD' AS tdb
)
)
ORDER BY cy, week;

Which, for the sample data, outputs:

CY WEEK HCM PALPHABET PBI SCM PBU PDT PIO POTHER PWC TDB
2022 6 21.5 0 22.75 0 0 0 0 0 0 0
2022 7 109 0 69.5 0 0 0 0 12.25 0 1
2022 8 147.25 0 254.5 0 0 0 0 142.25 124.75 38.75
2022 9 214.25 32 268 0 0 23 0 142.25 142 38.75
2022 10 269.75 32 245.25 0 52.25 23 187.5 142.25 142 85.75
2022 11 182.25 32 198.5 24.25 112.25 23 191.5 130 142 84.75

fiddle

答案2

得分: 1

Here is the translated code portion without the code:

You'll possibly want to use MTO's answer, since it operates on true dates (so it will handle the week resetting after the year changes) and is cleaner in how it handles the coalesces. However, I have adjusted my answer to work with your new data and new fields:

WITH ranges AS
(
 select distinct week s FROM my_table
)
select cy, week, COALESCE(HCM, 0) HCM, COALESCE(PALPHABET, 0) PALPHABET, COALESCE(PBI, 0) PBI, COALESCE(SCM, 0) SCM, COALESCE(PBU, 0) PBU, COALESCE(PDT, 0) PDT, COALESCE(PIO, 0) PIO, COALESCE(POTHER, 0) POTHER, COALESCE(PWC, 0) PWC, COALESCE(TBD, 0) TBD
from (
 SELECT cy, s week, rc, SUM(duration_minutes) rolling_avg
 FROM ranges r left outer join my_table m
   ON week > r.s - 4
    AND week <= r.s
 GROUP BY cy,s,rc
)
pivot 
(
   sum(rolling_avg/4)
   for rc IN ('HCM' HCM, 'PALPHABET' PALPHABET, 'PBI' PBI, 'SCM' SCM, 'PBU' PBU, 'PDT' PDT, 'PIO' PIO, 'POTHER' POTHER, 'PWC' PWC, 'TBD' TBD)
)
ORDER BY cy, week

Original answer:

I think you are looking for something like:

WITH ranges AS
(
 select (SELECT MIN(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) AS INT)) FROM MY_TABLE) + level - 1 s
 from   dual
 connect by level <= (SELECT MAX(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) AS INT)) FROM MY_TABLE) - (SELECT MIN(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) AS INT)) FROM MY_TABLE)+1 
)
select cy_week, COALESCE(TBD, 0) TBD, COALESCE(HCM, 0) HCM, COALESCE(SCM, 0) SCM, COALESCE(PIO, 0) PIO, COALESCE(PBU, 0) PBU, COALESCE(FOO, 0) FOO 
from (
 SELECT '2022_'||r.s cy_week, rc, SUM(duration_minutes) rolling_avg
 FROM ranges r left outer join my_table m
   ON CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) AS INT) > r.s - 4
    AND CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) AS INT) <= r.s
 GROUP BY s,rc
)
pivot 
(
   sum(rolling_avg/4)
   for rc IN ('TBD' TBD, 'HCM' HCM, 'SCM' SCM, 'PIO' PIO, 'PBU' PBU, 'FOO' FOO)
)
ORDER BY CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, '_') + 1) as int)

Please note that the code is provided as is, and any specific database or context-related adjustments may be required when implementing it.

英文:

You'll possibly want to use MTO's answer, since it operates on true dates (so it will handle the week reseting after the year changes) and is cleaner in how in handles the coalesces. However, I have adjusted my answer to work with your new data and new fields:

WITH ranges AS
(
select distinct week s FROM my_table
)
select cy, week, COALESCE(HCM, 0) HCM, COALESCE(PALPHABET, 0) PALPHABET, COALESCE(PBI, 0) PBI, COALESCE(SCM, 0) SCM, COALESCE(PBU, 0) PBU, COALESCE(PDT, 0) PDT, COALESCE(PIO, 0) PIO, COALESCE(POTHER, 0) POTHER, COALESCE(PWC, 0) PWC, COALESCE(TBD, 0) TBD
from (
SELECT cy, s week, rc, SUM(duration_minutes) rolling_avg
FROM ranges r left outer join my_table m
ON week &gt; r.s - 4
AND week &lt;= r.s
GROUP BY cy,s,rc
)
pivot 
(
sum(rolling_avg/4)
for rc IN (&#39;HCM&#39; HCM, &#39;PALPHABET&#39; PALPHABET, &#39;PBI&#39; PBI, &#39;SCM&#39; SCM, &#39;PBU&#39; PBU, &#39;PDT&#39; PDT, &#39;PIO&#39; PIO, &#39;POTHER&#39; POTHER, &#39;PWC&#39; PWC, &#39;TBD&#39; TBD)
)
ORDER BY cy, week

You can see it in this Fiddle.

Original answer:

I think you are looking for something like:

WITH ranges AS
(
select (SELECT MIN(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT)) FROM MY_TABLE) + level - 1 s
from   dual
connect by level &lt;= (SELECT MAX(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT)) FROM MY_TABLE) - (SELECT MIN(CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT)) FROM MY_TABLE)+1 
)
select cy_week, COALESCE(TBD, 0) TBD, COALESCE(HCM, 0) HCM, COALESCE(SCM, 0) SCM, COALESCE(PIO, 0) PIO, COALESCE(PBU, 0) PBU, COALESCE(FOO, 0) FOO 
from (
SELECT &#39;2022_&#39;||r.s cy_week, rc, SUM(duration_minutes) rolling_avg
FROM ranges r left outer join my_table m
ON CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT) &gt; r.s - 4
AND CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT) &lt;= r.s
GROUP BY s,rc
)
pivot 
(
sum(rolling_avg/4)
for rc IN (&#39;TBD&#39; TBD, &#39;HCM&#39; HCM, &#39;SCM&#39; SCM, &#39;PIO&#39; PIO, &#39;PBU&#39; PBU, &#39;FOO&#39; FOO)
)
ORDER BY CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) as int)

You can see it working in this Fiddle. This is a bit messy and can likely be cleaned up a bit, but hopefully it's a good starting point.

The ranges cte gets all the weeks between your first and your last date. However, if you know that all weeks exist in your data, you could simplify this to something like

WITH ranges AS
(SELECT DISTINCT CAST(SUBSTR(CY_WEEK, INSTR(CY_WEEK, &#39;_&#39;) + 1) AS INT) s FROM MY_TABLE)

It would also be easier to work with actual dates, since this simply treats the part of the string after the underscore as an integer for comparison. You would need to add logic to handle wrapping (if the week number resets when the year changes). You could also clean this up a bit by putting the substring/cast in one CTE and run the query against that (so that logic is not repeated in a number of places).

Since you explicitly said you want 0s, I wrote out the COALESCEs in the SELECT statement (there might be a better way of doing this). If nulls were fine, you could just use SELECT *.

答案3

得分: 0

以下是翻译好的内容:

  1. 创建一个包含4周数据的临时表。也许您想手动提供这些数据,或者从CURRENT_DATE()计算出来。
  2. 创建一个包含MY_TABLE中所有可能的RC值的临时表:SELECT DISTINCT RC FROM MY_TABLE
  3. 对这两个表执行CROSS JOIN操作。
  4. 对CROSS JOIN的结果与MY_TABLE进行LEFT JOIN操作。
英文:

Not enough material to flesh out the whole answer, but the basic strategy would be to

  1. Create a temporary table with the 4-week data. Maybe you want to supply this manually, or compute it from CURRENT_DATE()
  2. Create a temporary table with all possible values for RC : SELECT DISTINCT RC FROM MY_TABLE
  3. Do a CROSS JOIN of those two tables
  4. Do a LEFT JOIN of pairs from the cross join to MY_TABLE

答案4

得分: 0

以下是您要翻译的内容:

  1. 创建CTE以生成从0到52的周数
  2. 创建CTE以汇总每周的DURATION_MINUTES并填充所有周的数据
  3. 创建CTE,将执行上述数据的4周窗口总和,并设置所需期间的WHERE条件
  4. 旋转结果

最终数据集应如下所示。

请注意,此部分是直接从您提供的代码中提取的内容,不需要进一步翻译。

英文:

With MY_TABLE data as provided you could do it like here:

  1. create cte generating weeks from 0 to 52
WITH
    weeks AS
        (   Select LEVEL - 1 &quot;WEEK&quot; From Dual  Connect By LEVEL &lt;= 53   ),
  1. create cte summing the DURATION_MINUTES per weeks and fill in all the weeks data
    minutes as
        (   Select  DISTINCT y.CY, t.RC,
                    To_Number(CASE WHEN w.WEEK - 3 &lt; 0 THEN Nvl(t.CY, y.CY) - 1 ELSE Nvl(t.CY, y.CY) END) ||
                            LPAD(CASE WHEN w.WEEK - 3 &lt; 0 THEN 53 + w.WEEK - 4 ELSE w.WEEK - 3 END, 2, &#39;0&#39;) &quot;WEEK_FROM&quot;,
                    To_Number(y.CY || LPAD(w.WEEK, 2, &#39;0&#39;)) &quot;WEEK&quot;,
                    Sum(CASE WHEN To_Number(t.CY || LPAD(t.WEEK, 2, &#39;0&#39;)) = To_Number(y.CY || LPAD(w.WEEK, 2, &#39;0&#39;)) 
                             THEN t.DURATION_MINUTES 
                        ELSE 0 END) OVER(Partition By t.RC, y.CY, w.WEEK) &quot;WEEK_MINS&quot;
            From weeks w
            Inner Join (Select CY From MY_TABLE Group By CY ORDER BY CY) y ON(1=1)
            Left Join  MY_TABLE t ON(1 = 1)
            ), 
  1. create cte that will do the 4 week window sums of the above data and set a WHERE condition for the period you want. Without it you will get (with this sample data) 53 rows (looks interesting) where you can see when the durations occured.
    grid AS
        (  Select  CY, RC, To_Number(WEEK_FROM) &quot;WEEK_FROM&quot;, To_Number(WEEK) &quot;WEEK&quot;,
                      Sum(WEEK_MINS) 
                          OVER(Partition By RC Order By WEEK Rows Between 3 Preceding And Current Row) &quot;FOUR_WEEK_MINUTES&quot;,
                      0 as HCM, 0 as PALPHABET, 0 as PBI, 0 as SCM, 0 as PBU, 0 as PDT, 0 as PIO, 0 as POTHER, 0 as PWC, 0 as TBD
            From   minutes
            WHERE    WEEK - WEEK_FROM IN(3, 51)   And
                     CY = 2022 And To_Number(SubStr(WEEK, 5, 2)) Between 6 And 11 
        )
  1. Pivot the result
SELECT  *
FROM (   Select    WEEK &quot;WEEK&quot;, g.RC &quot;RC&quot;, g.FOUR_WEEK_MINUTES &quot;FOUR_WEEK_MINUTES&quot;
        From      grid g
     )
    PIVOT ( SUM(FOUR_WEEK_MINUTES / 4)  FOR RC IN (&#39;HCM&#39; AS HCM, &#39;PALPHABET&#39; AS PALPHABET, &#39;PBI&#39; AS PBI, &#39;SCM&#39; AS SCM, &#39;PBU&#39; AS PBU,
                                                   &#39;PDT&#39; AS PDT, &#39;PIO&#39; AS PIO, &#39;POTHER&#39; AS POTHER, &#39;PWC&#39; AS PWC, &#39;TBD&#39; AS TBD )
          )
ORDER BY WEEK

... and the resulting dataset should be

      WEEK        HCM  PALPHABET        PBI        SCM        PBU        PDT        PIO     POTHER        PWC        TBD
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
    202206       21.5          0      22.75          0          0          0          0          0          0          0 
    202207        109          0       69.5          0          0          0          0      12.25          0          1 
    202208     147.25          0      254.5          0          0          0          0     142.25     124.75      38.75 
    202209     214.25         32        268          0          0         23          0     142.25        142      38.75 
    202210     269.75         32     245.25          0      52.25         23      187.5     142.25        142      85.75 
    202211     182.25         32      198.5      24.25     112.25         23      191.5        130        142      84.75

Didn't test it for periods envolving cross years, but this should proces the data from previous year if it is within a 4 week window.
With current sample data to see the rows from before the 4 weeks window and after it - put the period limitiing the weeks between 5 and 15 to see all affected rows - result is:

      WEEK        HCM  PALPHABET        PBI        SCM        PBU        PDT        PIO     POTHER        PWC        TBD
---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
    202205          0          0          0          0          0          0          0          0          0          0 
    202206       21.5          0      22.75          0          0          0          0          0          0          0 
    202207        109          0       69.5          0          0          0          0      12.25          0          1 
    202208     147.25          0      254.5          0          0          0          0     142.25     124.75      38.75 
    202209     214.25         32        268          0          0         23          0     142.25        142      38.75 
    202210     269.75         32     245.25          0      52.25         23      187.5     142.25        142      85.75 
    202211     182.25         32      198.5      24.25     112.25         23      191.5        130        142      84.75 
    202212        144         32       13.5      24.25     112.25         23      191.5          0      17.25         47 
    202213         77          0          0      24.25     112.25          0      191.5          0          0         47 
    202214          0          0          0      24.25         60          0          4          0          0          0 
    202215          0          0          0          0          0          0          0          0          0          0

huangapple
  • 本文由 发表于 2023年3月4日 04:09:28
  • 转载请务必保留本文链接:https://go.coder-hub.com/75631470.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定