2023-07-14 03:42:27

Delete from SQL table if there are Duplicates AND if duplicates are older than 30 days

Question

I am trying to write an SQL query to access a Microsoft SQL table. What I am hoping to accomplish is that I can find all rows that have duplicates and delete duplicates only if they are older than 30 days. Here is an example table:

    INSERT INTO [dbo].[test]
        (id1, id2, firstName, lastName, dayTime)
    VALUES
        (12, 13, 'Syed', 'Abbas', '05-02-2023'),
        (12, 13, 'Syed', 'Abbas', '07-02-2023'),
        (12, 14, 'Adam', 'Johnson', '07-02-2023'),
        (10, 9, 'Monique', 'Brown', '03-03-2023')

And this is what I have written for my query:

    DELETE T
    FROM
    (
        SELECT *,
            DupRank = ROW_NUMBER() OVER (
                          PARTITION BY id1, id2
                          ORDER BY (SELECT NULL)
                      )
        FROM [dbo].[test]
    ) AS T
    WHERE DupRank > 1 AND dayTime < DATEADD(day, -30, GETDATE())

The outcome I am trying to get is that only row 1 (12, 13, Syed, Abbas, 05-02-2023) is deleted and the rest of the rows stay. However, when I run this query, it does not delete anything: no errors, just 0 rows affected.

I have tried the separate parts of the query and they work fine (i.e., when I just delete duplicates, it removes row 2, and when I just delete rows older than 30 days, it removes rows 1 and 4). I am not sure if I am using the "and" clause incorrectly?

Answer 1

Score: 1

I'm guessing (although it's hard to say without seeing the query plan) that the non-deterministic ORDER BY is causing problems.

When you write ORDER BY (SELECT NULL) that means that the server is free to calculate the row-number in any order. So it could be that the older row is being numbered 1 and the newer row 2. Then when you filter to DupRank > 1 and dayTime < DATEADD(day, -30, GETDATE()) you are filtering out both rows.
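
To check this guess, a read-only query along the following lines can show the rank and the age test side by side for each row. It is only a sketch: it assumes `dayTime` is a date or datetime column, and the alias `IsOlderThan30Days` is made up for illustration.

```sql
-- Diagnostic sketch (read-only): nothing is deleted here.
-- Assumes dbo.test.dayTime is a date/datetime column.
SELECT
    id1,
    id2,
    dayTime,
    -- Same arbitrary numbering as in the question's query.
    DupRank = ROW_NUMBER() OVER (
                  PARTITION BY id1, id2
                  ORDER BY (SELECT NULL)
              ),
    -- 1 when the row passes the age filter; the alias is illustrative.
    IsOlderThan30Days = CASE
                            WHEN dayTime < DATEADD(day, -30, GETDATE()) THEN 1
                            ELSE 0
                        END
FROM dbo.test
ORDER BY id1, id2, DupRank;
```

The original DELETE removes a row only when `DupRank > 1` and `IsOlderThan30Days = 1` hold on the same row. If the old duplicate shows `DupRank = 1`, no row satisfies both conditions. Because the numbering is arbitrary, the ranks shown by this query are not guaranteed to match the ones the DELETE itself would compute.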

So just use a deterministic numbering. The logical thing to do here would be to number from newest to oldest, so that you always keep the newest row and any others which are less than 30 days old.

    DELETE T
    FROM
    (
        SELECT *,
          DupRank = ROW_NUMBER() OVER (
                    PARTITION BY T.id1, T.id2
                    ORDER BY T.dayTime DESC)
        FROM dbo.test T
    ) AS T
    WHERE T.DupRank > 1
      AND T.dayTime < DATEADD(day, -30, GETDATE());
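
Before running a DELETE like this against real data, it may be worth previewing which rows it would remove. One possible way, sketched here on the assumption that opening and rolling back a transaction is acceptable in your environment, is to run the same logic through a CTE (named `Ranked` purely for illustration), return the removed rows with `OUTPUT`, and then roll back.

```sql
-- Preview sketch: delete inside a transaction, inspect the rows, then undo.
BEGIN TRANSACTION;

WITH Ranked AS (
    SELECT *,
           -- Newest row in each (id1, id2) group gets DupRank = 1.
           DupRank = ROW_NUMBER() OVER (
                         PARTITION BY id1, id2
                         ORDER BY dayTime DESC
                     )
    FROM dbo.test
)
DELETE FROM Ranked
-- Return the rows that were removed so they can be reviewed.
OUTPUT deleted.id1, deleted.id2, deleted.firstName,
       deleted.lastName, deleted.dayTime
WHERE DupRank > 1
  AND dayTime < DATEADD(day, -30, GETDATE());

-- Nothing is kept: switch to COMMIT TRANSACTION once the output looks right.
ROLLBACK TRANSACTION;
```

With the sample data from the question, the expectation is that only the older (12, 13) row comes back, since it is both a lower-ranked duplicate and older than 30 days.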
