SQL-Server Return rows where only column X has changed
Question
I have a historized table in SQL Server where the id (part of a composite key) repeats when a record is altered. For now, I would like to check which rows with the same id have a difference ONLY in col_a. Suppose the following data sample:
row_id | id | col_a   | col_b
-------------------------------
1      | 1  | value_a | value_b
2      | 1  | value_a | value_c
3      | 1  | value_d | value_c
4      | 1  | value_e | value_c
5      | 1  | value_f | value_g
I would like my output to be the following:
row_id | id | col_a   | col_b
-------------------------------
3      | 1  | value_d | value_c
4      | 1  | value_e | value_c
My data has a lot more columns, but the idea is that I want to return only the rows that were duplicated to track a change in col_a and nothing else, preferably without having to create or write to a table. I added row_id for illustration purposes; I would have to create it out of the composite key if it is required to solve the problem.
Answer 1
Score: 1
Here is how to achieve this using the LAG function:
WITH CTE_Lag AS
(
    SELECT *
         , LAG(col_a) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a
         , LAG(col_b) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM mytable
)
SELECT * FROM CTE_Lag
WHERE col_a <> prev_col_a AND col_b = prev_col_b
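To try the query above without creating a table, the sample rows from the question can be supplied inline. This is only a sketch: the CTE name sample_data is made up for illustration, and the explicit column list stands in for whatever the real table contains.

```sql
-- sample_data is a hypothetical inline stand-in for the real history table.
WITH sample_data AS
(
    SELECT *
    FROM (VALUES
        (1, 1, 'value_a', 'value_b'),
        (2, 1, 'value_a', 'value_c'),
        (3, 1, 'value_d', 'value_c'),
        (4, 1, 'value_e', 'value_c'),
        (5, 1, 'value_f', 'value_g')
    ) AS v (row_id, id, col_a, col_b)
),
CTE_Lag AS
(
    SELECT row_id, id, col_a, col_b
         , LAG(col_a) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a
         , LAG(col_b) OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM sample_data
)
SELECT row_id, id, col_a, col_b
FROM CTE_Lag
WHERE col_a <> prev_col_a   -- col_a changed compared with the previous row
  AND col_b = prev_col_b    -- col_b stayed the same
ORDER BY row_id;
-- Returns the rows with row_id 3 and 4 for these sample rows.
```

The first row of each id has NULL in both prev_ columns, so neither comparison evaluates to true and the row is filtered out. The same happens for any row where col_a or col_b is itself NULL, so nullable columns would need a NULL-aware comparison.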
Answer 2
Score: 1
With two columns, one that needs to change and one that must not, you could simply use IS (NOT) DISTINCT FROM (assuming SQL Server 2022, as there are no tags to suggest otherwise) to check that col_a is different while col_b is not, with a LAG/LEAD in a CTE.
You suggest that you have more columns, though. As such, an INTERSECT might work better here. I use ROW_NUMBER in a CTE and then an EXISTS (with said INTERSECT) against the same CTE, but on the ROW_NUMBER + 1:
WITH RNs AS(
    SELECT row_id,
           id,
           col_a,
           col_b,
           col_c, --col_c to col_e stand in for the additional columns;
           col_d, --they must be selected here so the INTERSECT below can see them
           col_e,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY row_id ASC) AS RN --Assumes Row_id is irrelevant of id, which isn't what we want
    FROM dbo.YourTable YT1)
SELECT RN1.row_id,
       RN1.id,
       RN1.col_a,
       RN1.col_b
FROM RNs RN1
     JOIN RNs RN2 ON RN1.id = RN2.id
                 AND RN1.col_a IS DISTINCT FROM RN2.col_a --Assumes 2022+ again
                 AND RN1.RN = RN2.RN + 1
WHERE EXISTS (SELECT RN1.col_b, RN1.col_c, RN1.col_d, RN1.col_e
              INTERSECT
              SELECT RN2.col_b, RN2.col_c, RN2.col_d, RN2.col_e);
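The two-column variant described at the start of this answer is not spelled out, so here is one way it might look. It is a sketch under the same assumption of SQL Server 2022 or later; sample_data and prev_row_id are names introduced here for illustration and are not part of the original answer.

```sql
-- sample_data is a hypothetical inline stand-in for dbo.YourTable.
WITH sample_data AS
(
    SELECT *
    FROM (VALUES
        (1, 1, 'value_a', 'value_b'),
        (2, 1, 'value_a', 'value_c'),
        (3, 1, 'value_d', 'value_c'),
        (4, 1, 'value_e', 'value_c'),
        (5, 1, 'value_f', 'value_g')
    ) AS v (row_id, id, col_a, col_b)
),
Lagged AS
(
    SELECT row_id,
           id,
           col_a,
           col_b,
           LAG(row_id) OVER (PARTITION BY id ORDER BY row_id) AS prev_row_id,
           LAG(col_a)  OVER (PARTITION BY id ORDER BY row_id) AS prev_col_a,
           LAG(col_b)  OVER (PARTITION BY id ORDER BY row_id) AS prev_col_b
    FROM sample_data
)
SELECT row_id,
       id,
       col_a,
       col_b
FROM Lagged
WHERE prev_row_id IS NOT NULL                 -- skip the first row of each id
  AND col_a IS DISTINCT FROM prev_col_a       -- col_a changed (NULL-safe)
  AND col_b IS NOT DISTINCT FROM prev_col_b   -- col_b unchanged (NULL-safe)
ORDER BY row_id;
-- Returns the rows with row_id 3 and 4 for these sample rows.
```

The prev_row_id check is there because IS NOT DISTINCT FROM treats two NULLs as equal: without it, a first row whose col_b happens to be NULL would match the missing previous row and be reported as a change in col_a.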
Answer 3
Score: 0
Sample data:
CREATE TABLE mytable(
    row_id INTEGER NOT NULL
   ,id int NOT NULL
   ,col_a VARCHAR(30) NOT NULL
   ,col_b VARCHAR(30) NOT NULL
);
INSERT INTO mytable
(row_id,id,col_a,col_b) VALUES
(1,1,'value_a','value_b'),
(2,1,'value_a','value_c'),
(3,1,'value_d','value_c'),
(4,1,'value_e','value_c'),
(5,1,'value_f','value_g'),
(7,1,'value_d','value_c');/*Repeated value*/
--(7,1,'value_b','value_c'); /*this would not be a repeated value, it is a whole new value*/
Use DENSE_RANK:
WITH t
     AS (SELECT *,
                DENSE_RANK()
                  OVER(
                    PARTITION BY id, col_b
                    ORDER BY col_a ) id2
         FROM mytable)
SELECT MIN(row_id) row_id, id, col_a, col_b
FROM t
WHERE id2 > 1
GROUP BY id, col_a, col_b
Or feed the first CTE into a second CTE:
WITH t
     AS (SELECT *,
                DENSE_RANK()
                  OVER(
                    PARTITION BY id, col_b
                    ORDER BY col_a ) id2
         FROM mytable),
     /*new CTE*/
     a
     AS (SELECT row_id,
                id,
                col_a,
                col_b,
                ROW_NUMBER()
                  OVER(
                    PARTITION BY id, col_a, col_b
                    ORDER BY row_id ) id3
         FROM t
         WHERE id2 > 1)
SELECT row_id,
       id,
       col_a,
       col_b
FROM a
WHERE id3 = 1