How to derive new columns in YYYYMM format between two dates

Question

I have the following table:

| ID | Name | StartDate | EndDate |
| --- | --- | --- | --- |
| 1 | Aa | 2021-10-14 | 2021-12-22 |
| 2 | Ab | 2021-12-02 | 2022-10-05 |

The requirement is to add new columns in YYYYMM format, one for every month between min(StartDate) and max(EndDate), and to assign values to the corresponding cells. A cell should be 1 if that month falls between the row's StartDate and EndDate, and 0 if it does not. The final output should look like the table below:

| ID | Name | StartDate | EndDate | 202110 | 202111 | 202112 | 202201 | 202202 | 202203 | 202204 | 202205 | 202206 | 202207 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Aa | 2021-10-14 | 2021-12-22 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | Ab | 2021-12-02 | 2022-07-05 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |

For the first row, since StartDate = 2021-10-14 and EndDate = 2021-12-22, the matching new columns are 202110, 202111 and 202112; the cell values are 1 for these columns and 0 for all the others. The same logic applies to the other rows.

I could not figure out the logic to derive the new table with the new columns and the corresponding cell values.

Answer 1

Score: 1

How about this approach using dynamic SQL:

CREATE TABLE Data (ID INT, Name VARCHAR(100), StartDate DATE, EndDate DATE)
INSERT Data
VALUES
    (1, 'Aa', '2021-10-14', '2021-12-22'),
    (2, 'Ab', '2021-12-02', '2022-10-05')

-- Extract overall date range
DECLARE @MinStartDate DATE, @MaxEndDate AS DATE
SELECT @MinStartDate = MIN(StartDate), @MaxEndDate = MAX(EndDate)
FROM Data

-- Convert to months (DATETRUNC() may be used instead with SQL Server 2022 and later)
DECLARE @StartMonth DATE = DATEADD(day, 1 - DAY(@MinStartDate), @MinStartDate)
DECLARE @EndMonth DATE = DATEADD(day, 1 - DAY(@MaxEndDate), @MaxEndDate)

-- Generate calendar of months within range (one row per month, first day of month)
DECLARE @Months TABLE (Month DATE)
;WITH Months AS (
   SELECT @StartMonth AS Month
   UNION ALL
   SELECT DATEADD(month, 1, M.Month)
   FROM Months M
   WHERE M.Month < @EndMonth
)
INSERT @Months
SELECT M.Month
FROM Months M

-- Define SQL templates
DECLARE @SqlTemplate VARCHAR(MAX) = '
SELECT ID, Name, StartDate, EndDate
<ColumnSql>
FROM Data D
ORDER BY D.Name
';

DECLARE @ColumnTemplate VARCHAR(MAX) = '
    , CASE WHEN D.StartDate <= <MonthEnd> AND <MonthStart> <= D.EndDate THEN 1 ELSE 0 END AS <ColumnName>';

-- Build month-specific column select items from template
-- (STRING_AGG requires SQL Server 2017 or later)
DECLARE @ColumnSql VARCHAR(MAX) = (
    SELECT STRING_AGG(C.ColumnSql, '') WITHIN GROUP(ORDER BY M.Month)
    FROM @Months M
    CROSS APPLY (
       SELECT
            CONVERT(CHAR(6), M.Month, 112) AS ColumnName,
            M.Month AS MonthStart,
            EOMONTH(M.Month) AS MonthEnd
    ) MD
    CROSS APPLY (
        -- QUOTENAME(x) wraps the column name in [ ]; QUOTENAME(x, '''') wraps the date in single quotes
        SELECT REPLACE(REPLACE(REPLACE(
            @ColumnTemplate
            , '<ColumnName>', QUOTENAME(MD.ColumnName))
            , '<MonthStart>', QUOTENAME(CONVERT(CHAR(8), MD.MonthStart, 112), ''''))
            , '<MonthEnd>', QUOTENAME(CONVERT(CHAR(8), MD.MonthEnd, 112), ''''))
            AS ColumnSql
    ) C
)

-- Build final SQL
DECLARE @Sql VARCHAR(MAX) = REPLACE(@SqlTemplate, '<ColumnSql>', @ColumnSql)

-- Show the generated statement for inspection
SELECT @Sql

-- Execute
EXEC (@Sql)

Generated SQL:

SELECT ID, Name, StartDate, EndDate

    , CASE WHEN D.StartDate <= '20211031' AND D.EndDate >= '20211001' THEN 1 ELSE 0 END AS [202110]
    , CASE WHEN D.StartDate <= '20211130' AND D.EndDate >= '20211101' THEN 1 ELSE 0 END AS [202111]
    , CASE WHEN D.StartDate <= '20211231' AND D.EndDate >= '20211201' THEN 1 ELSE 0 END AS [202112]
    , CASE WHEN D.StartDate <= '20220131' AND D.EndDate >= '20220101' THEN 1 ELSE 0 END AS [202201]
    , CASE WHEN D.StartDate <= '20220228' AND D.EndDate >= '20220201' THEN 1 ELSE 0 END AS [202202]
    , CASE WHEN D.StartDate <= '20220331' AND D.EndDate >= '20220301' THEN 1 ELSE 0 END AS [202203]
    , CASE WHEN D.StartDate <= '20220430' AND D.EndDate >= '20220401' THEN 1 ELSE 0 END AS [202204]
    , CASE WHEN D.StartDate <= '20220531' AND D.EndDate >= '20220501' THEN 1 ELSE 0 END AS [202205]
    , CASE WHEN D.StartDate <= '20220630' AND D.EndDate >= '20220601' THEN 1 ELSE 0 END AS [202206]
    , CASE WHEN D.StartDate <= '20220731' AND D.EndDate >= '20220701' THEN 1 ELSE 0 END AS [202207]
    , CASE WHEN D.StartDate <= '20220831' AND D.EndDate >= '20220801' THEN 1 ELSE 0 END AS [202208]
    , CASE WHEN D.StartDate <= '20220930' AND D.EndDate >= '20220901' THEN 1 ELSE 0 END AS [202209]
    , CASE WHEN D.StartDate <= '20221031' AND D.EndDate >= '20221001' THEN 1 ELSE 0 END AS [202210]
FROM Data D
ORDER BY D.Name

Results:

| ID | Name | StartDate | EndDate | 202110 | 202111 | 202112 | 202201 | 202202 | 202203 | 202204 | 202205 | 202206 | 202207 | 202208 | 202209 | 202210 |
| - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1 | Aa | 2021-10-14 | 2021-12-22 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | Ab | 2021-12-02 | 2022-10-05 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
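One caveat for wider date ranges: a recursive CTE in SQL Server stops after 100 recursion levels by default, so the calendar step would raise an error once the span exceeds roughly a hundred months. A minimal sketch of that step with the limit lifted follows. The variable and table names mirror the script, the hard-coded dates are placeholders chosen only for illustration, and `DATEFROMPARTS` is used here simply as an alternative way to get the first day of a month.

```sql
-- Placeholder range; in the script these values come from MIN(StartDate) / MAX(EndDate)
DECLARE @MinStartDate DATE = '2010-03-15', @MaxEndDate DATE = '2022-10-05'

-- First day of each boundary month
DECLARE @StartMonth DATE = DATEFROMPARTS(YEAR(@MinStartDate), MONTH(@MinStartDate), 1)
DECLARE @EndMonth DATE = DATEFROMPARTS(YEAR(@MaxEndDate), MONTH(@MaxEndDate), 1)

DECLARE @Months TABLE (Month DATE)
;WITH Months AS (
    SELECT @StartMonth AS Month
    UNION ALL
    SELECT DATEADD(month, 1, M.Month)
    FROM Months M
    WHERE M.Month < @EndMonth
)
INSERT @Months
SELECT M.Month
FROM Months M
OPTION (MAXRECURSION 0)  -- 0 removes the default limit of 100 levels

SELECT COUNT(*) AS MonthCount FROM @Months
```

The `WHERE M.Month < @EndMonth` condition still terminates the recursion, so removing the limit is safe here as long as the end month is not before the start month.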

See this db<>fiddle.

The above uses the standard test for date range overlap, "start1 <= end2 AND start2 <= end1", which assumes all dates are inclusive.
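The overlap test can also be tried in isolation for a single row and a single month. The sketch below hard-codes the first row from the question and checks it against January 2022; the variable names are illustrative and not part of the script.

```sql
DECLARE @StartDate DATE = '2021-10-14', @EndDate DATE = '2021-12-22'  -- row 1
DECLARE @MonthStart DATE = '2022-01-01'
DECLARE @MonthEnd DATE = EOMONTH(@MonthStart)  -- 2022-01-31

-- Two inclusive ranges overlap when each one starts on or before the other ends
SELECT CASE WHEN @StartDate <= @MonthEnd AND @MonthStart <= @EndDate
            THEN 1 ELSE 0 END AS InMonth  -- 0: the row ended before January 2022
```

Changing `@MonthStart` to '2021-12-01' makes both conditions true, which matches the 1 in the 202112 column for that row.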

huangapple
  • Published on February 7, 2023, 00:34:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75364084.html