SQL Server: return rows where only column X has changed

Question

I have a historized table in SQL Server where the id (part of a composite key) repeats when a record is altered. For now, I would like to check which rows with the same id have a difference ONLY in col_a. Suppose the following data sample:

row_id | id | col_a   | col_b
-------------------------------
1      | 1  | value_a | value_b
2      | 1  | value_a | value_c
3      | 1  | value_d | value_c
4      | 1  | value_e | value_c
5      | 1  | value_f | value_g

I would like my output to be the following:

row_id | id | col_a   | col_b
-------------------------------
3      | 1  | value_d | value_c
4      | 1  | value_e | value_c

My data has a lot more columns, but the idea is that I want to return only the rows that were duplicated to track a change in col_a and nothing else, preferably without having to create or write to a table. I added row_id for illustration purposes; if it is required to solve the problem, I would have to create it from the composite key.
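
For the last point, a row number can be derived at query time rather than stored. This is a minimal sketch, assuming the composite key is (id, valid_from): both valid_from and the table name dbo.history_table are hypothetical, since the question does not name the other key columns or the table.

```sql
-- Assumption: (id, valid_from) is the composite key; valid_from and
-- dbo.history_table are hypothetical names not given in the question.
SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY valid_from) AS row_id,
       id,
       col_a,
       col_b
FROM dbo.history_table;
```

The number restarts at 1 for each id, which is enough for comparing a row with its predecessor inside the same id. Placed in a CTE, the result can feed any of the queries in the answers without creating a table.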

Answer 1

Score: 1

Here is how to do this with the LAG function:

WITH CTE_Lag AS
(
    SELECT *
         , LAG(col_a) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a
         , LAG(col_b) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM mytable
)
SELECT * FROM CTE_Lag
WHERE col_a <> prev_col_a AND col_b = prev_col_b
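
To try the LAG approach without creating anything, the sample rows can be supplied inline. The sketch below assumes the five rows from the question and shadows the name mytable with a CTE built from a VALUES list; it also selects only the original columns so the helper prev_ columns do not appear in the result.

```sql
-- Sample rows from the question, supplied inline so no table is created.
WITH mytable AS
(
    SELECT *
    FROM (VALUES (1, 1, 'value_a', 'value_b'),
                 (2, 1, 'value_a', 'value_c'),
                 (3, 1, 'value_d', 'value_c'),
                 (4, 1, 'value_e', 'value_c'),
                 (5, 1, 'value_f', 'value_g')) AS v (row_id, id, col_a, col_b)
),
CTE_Lag AS
(
    SELECT *
         , LAG(col_a) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a
         , LAG(col_b) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM mytable
)
SELECT row_id, id, col_a, col_b
FROM CTE_Lag
-- Keep rows where col_a differs from the previous row and col_b does not.
WHERE col_a <> prev_col_a AND col_b = prev_col_b
ORDER BY row_id;
```

For this sample, the query returns the rows with row_id 3 and 4. Note that the plain <> and = comparisons treat NULL as unknown, so rows with NULL in col_a or col_b would be filtered out.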

Answer 2

Score: 1

With only two columns, one that must change and one that must not, you could simply use IS (NOT) DISTINCT FROM (assuming SQL Server 2022, as no tags suggest otherwise) with LAG/LEAD in a CTE to check that col_a is different while col_b is not.
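
A minimal sketch of that two-column variant, assuming SQL Server 2022 or later and the placeholder table dbo.YourTable with the columns from the question. The prev_row_id guard is an addition for illustration: it keeps the first row of each id from matching when col_b happens to be NULL.

```sql
WITH Lagged AS (
    SELECT row_id,
           id,
           col_a,
           col_b,
           LAG(row_id) OVER (PARTITION BY id ORDER BY row_id) AS prev_row_id,
           LAG(col_a)  OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a,
           LAG(col_b)  OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM dbo.YourTable)
SELECT row_id,
       id,
       col_a,
       col_b
FROM Lagged
WHERE prev_row_id IS NOT NULL                   --skip the first row of each id
  AND col_a IS DISTINCT FROM prev_col_a         --col_a changed (NULL-safe)
  AND col_b IS NOT DISTINCT FROM prev_col_b;    --col_b unchanged (NULL-safe)
```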

You suggest that you have more columns, though, so an INTERSECT might work better here. I use ROW_NUMBER in a CTE and then an EXISTS (with said INTERSECT) against the same CTE joined on ROW_NUMBER + 1:

WITH RNs AS(
    SELECT row_id,
           id,
           col_a,
           col_b,
           col_c, --col_c to col_e stand in for the additional columns mentioned in the question
           col_d,
           col_e,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY row_id ASC) AS RN --Assumes row_id is independent of id, which isn't what we want
    FROM dbo.YourTable YT1)
SELECT RN1.row_id,
       RN1.id,
       RN1.col_a,
       RN1.col_b
FROM RNs RN1
     JOIN RNs RN2 ON RN1.id = RN2.id
                 AND RN1.col_a IS DISTINCT FROM RN2.col_a --Assumes 2022+ again
                 AND RN1.RN = RN2.RN + 1
WHERE EXISTS (SELECT RN1.col_b, RN1.col_c, RN1.col_d, RN1.col_e
              INTERSECT
              SELECT RN2.col_b, RN2.col_c, RN2.col_d, RN2.col_e);
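
On versions before SQL Server 2022, where IS DISTINCT FROM is unavailable, the same NULL-safe test can be written with the INTERSECT pattern itself. This is a sketch under that assumption, reduced to the four columns of the sample data and using the same placeholder table name:

```sql
WITH RNs AS(
    SELECT row_id,
           id,
           col_a,
           col_b,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY row_id ASC) AS RN
    FROM dbo.YourTable)
SELECT RN1.row_id,
       RN1.id,
       RN1.col_a,
       RN1.col_b
FROM RNs RN1
     JOIN RNs RN2 ON RN1.id = RN2.id
                 AND RN1.RN = RN2.RN + 1
WHERE NOT EXISTS (SELECT RN1.col_a INTERSECT SELECT RN2.col_a) --col_a changed
  AND EXISTS (SELECT RN1.col_b INTERSECT SELECT RN2.col_b);    --col_b unchanged
```

INTERSECT treats two NULLs as equal, so NOT EXISTS over it behaves like IS DISTINCT FROM and EXISTS behaves like IS NOT DISTINCT FROM.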

Answer 3

Score: 0

Sample data:

CREATE TABLE mytable(
   row_id INTEGER  NOT NULL 
  ,id     int  NOT NULL
  ,col_a  VARCHAR(30) NOT NULL
  ,col_b  VARCHAR(30) NOT NULL
);
INSERT INTO mytable
(row_id,id,col_a,col_b) VALUES 
(1,1,'value_a','value_b'),
(2,1,'value_a','value_c'),
(3,1,'value_d','value_c'),
(4,1,'value_e','value_c'),
(5,1,'value_f','value_g'),
(7,1,'value_d','value_c');/*Repeated value*/
--(7,1,'value_b','value_c'); /*this is not a repeated value, it is a whole new value*/

Use DENSE_RANK:

WITH t
     AS (SELECT *,
                DENSE_RANK()
                  OVER(
                    PARTITION BY id, col_b
                    ORDER BY col_a ) id2
         FROM   mytable)
SELECT MIN(row_id) row_id, id, col_a, col_b
FROM   t
WHERE  id2 > 1
GROUP BY id, col_a, col_b

Or add a second CTE that references the first:

WITH t
     AS (SELECT *,
                DENSE_RANK()
                  OVER(
                    PARTITION BY id, col_b
                    ORDER BY col_a ) id2
         FROM   mytable),
/*new CTE*/
     a
     AS (SELECT row_id,
                id,
                col_a,
                col_b,
                ROW_NUMBER()
                  OVER(
                    PARTITION BY id, col_a, col_b
                    ORDER BY row_id ) id3
         FROM   t
         WHERE  id2 > 1)
SELECT row_id,
       id,
       col_a,
       col_b
FROM   a
WHERE  id3 = 1


huangapple
  • Posted on July 12, 2023, 20:34:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76670596.html