如何检索具有匹配数据的所有行的主键(多列)的建议:

huangapple go评论64阅读模式
英文:

Suggestions for how to retrieve primary keys for all rows with matching data (many columns)

问题

以下是您提供的内容的中文翻译:

我有一个SQL Server流程,为了节省存储空间,我们需要为每个唯一记录存储一行,并使用行的主键引用唯一数据集。

示例情景:给定一个类似以下的SQL Server表:

DataTable 
(
    DataID bigint, 
    name nvarchar(100), 
    birthday date, 
    value1 nvarchar(100), 
    value2 nvarchar(100)
)

注意:实际上有大约30列。

当接收到新数据时,我们需要将新数据组合添加到DataTable中,并检索已接收到的所有行的DataID,包括新数据和现有数据。

一个想法是利用EXCEPT,然后也许是INTERSECT,但我无法想出一种从已识别的行中获取ID的方法。

这个需求的更大背景是,我们有另一个表,用于跟踪接收到的所有数据,并且我们希望每天跟踪接收到哪些唯一数据集,而不必存储多份数据的副本。

数据跟踪表看起来类似于这样:

ReceivedData (ID bigint, ReceivedDate datetime, DataID bigint)

我目前脑海中的这个需求的流程可能看起来像这样:

给定一个新的数据表和前面提到的现有DataTable:

NewDataTable 
(
    name nvarchar(100), 
    birthday date, 
    value1 nvarchar(100), 
    value2 nvarchar(100)
)

INSERT INTO DataTable (name, birthday, value1, value2)
 	SELECT name, birthday, value1, value2 
    FROM NewDataTable
	EXCEPT
	SELECT name, birthday, value1, value2 
    FROM DataTable

我们可能可以在这里使用OUTPUT子句来获取新记录创建的DataID,但我们还需要DataID,用于匹配DataTable中现有记录的NewDataTable中的记录。

我知道我可以使用NewDataTable和现有DataTable之间的联接,使用所有表列,但这将是一个重型联接,我希望有更好的方法。

感谢您的所有时间和想法。

示例数据演示

现有[DataTable]记录:

DataID name birthday value1 value2
1 John 1/1/1990 abc 123
2 Sally 2/2/1990 def 321
3 Steve 3/3/1990 ghi 333

现有[ReceivedData]记录:

ID RecievedDate DataID
100 5/1/2023 03:01 1
101 5/1/2023 03:01 2
102 5/1/2023 03:01 3

然后,如果在“5/2/2023 03:00”接收到新数据,如下所示:

在[NewDataTable]中接收到的新数据

name birthday value1 value2
John 1/1/1990 abc 123
Steve 3/3/1990 ghi 333
Alice 4/3/1990 aaa 222

经过流程运行后,所需的数据状态如下:

生成的[DataTable]记录:

DataID name birthday value1 value2
1 John 1/1/1990 abc 123
2 Sally 2/2/1990 def 321
3 Steve 3/3/1990 ghi 333
4 Alice 4/3/1990 aaa 222

生成的[ReceivedData]记录:

ID RecievedDate DataID
100 5/1/2023 03:01 1
101 5/1/2023 03:01 2
102 5/1/2023 03:01 3
103 5/2/2023 03:00 1
104 5/2/2023 03:00 3
105 5/2/2023 03:00 4
英文:

I have a SQL Server process where to save storage we need to store one row per unique record and reference the unique data set using the row's primary key.

Example scenario: given a SQL Server table similar to the following:

DataTable 
(
    DataID bigint, 
    name nvarchar(100), 
    birthday date, 
    value1 nvarchar(100), 
    value2 nvarchar(100)
)

Note: in reality there are ~30 columns

When new data is received we need to add the new data combinations to the DataTable and retrieve the DataIDs for all rows that were received, both new and existing.

One idea was to leverage EXCEPT and then perhaps INTERSECT, but I am unable to think of a way to get the IDs back from the rows identified.

The bigger picture for this need is that we have another table which is used to track all data received and we want to keep track of which unique data sets were received each day without having to store multiple copies of the data.

The data tracking table looks something like this:

ReceivedData (ID bigint, ReceivedDate datetime, DataID bigint)

The process currently in my head for this need could look something like this:

Given a new data in a table and the previously mentioned existing DataTable:

NewDataTable 
(
    name nvarchar(100), 
    birthday date, 
    value1 nvarchar(100), 
    value2 nvarchar(100)
)

INSERT INTO DataTable (name, birthday, value1, value2)
 	SELECT name, birthday, value1, value2 
    FROM NewDataTable
	EXCEPT
	SELECT name, birthday, value1, value2 
    FROM DataTable

We could probably use an OUTPUT clause here to get the DataIDs for the new records created, but we also need the DataIDs for records in the NewDataTable which match existing records in DataTable.

I know I could use join between the NewDataTable and the existing DataTable using all table columns, but this would be a heavy join and I'm hoping for a better way.

Thank you for all your time and ideas.

Example Data Walkthrough

Existing [DataTable] records:

DataID name birthday value1 value2
1 John 1/1/1990 abc 123
2 Sally 2/2/1990 def 321
3 Steve 3/3/1990 ghi 333

Existing [ReceivedData] records:

ID RecievedDate DataID
100 5/1/2023 03:01 1
101 5/1/2023 03:01 2
102 5/1/2023 03:01 3

Then, if new data is received on "5/2/2023 03:00" as follows:

New data received in [NewDataTable]

name birthday value1 value2
John 1/1/1990 abc 123
Steve 3/3/1990 ghi 333
Alice 4/3/1990 aaa 222

The desired data state after the process runs would be:

Resulting [DataTable] records:

DataID name birthday value1 value2
1 John 1/1/1990 abc 123
2 Sally 2/2/1990 def 321
3 Steve 3/3/1990 ghi 333
4 Alice 4/3/1990 aaa 222

Resulting [ReceivedData] records:

ID RecievedDate DataID
100 5/1/2023 03:01 1
101 5/1/2023 03:01 2
102 5/1/2023 03:01 3
103 5/2/2023 03:00 1
104 5/2/2023 03:00 3
105 5/2/2023 03:00 4

答案1

得分: 0

以下是您提供的内容的中文翻译:

-- 数据模型
创建表格 DataTable(DataId int primary key identity,name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));
启用标识插入 DataTable;
插入数据到 DataTable (DataId,name,birthday,value1,value2) 
 (1,'John','1990-01-01','abc','123')
,(2,'Sally','1990-03-03','def','321')
,(3,'Steve','1990-03-03','ghi','333')
;
关闭标识插入 DataTable;

创建表格 NewDataTable(name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));
插入数据到 NewDataTable 
 ('John','1990-01-01','abc','123')
,('Steve','1990-03-03','ghi','333')
,('Alice','1990-04-03','aaa','222')
;
创建表格 LogTable(id int identity,actiondate datetime,action varchar(10),DataId int ,name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));

-- 查询
合并 DataTable
使用 NewDataTable n on n.name=DataTable.name -- ...等等
当不匹配时
  插入(name,birthday,value1,value2) (n.name,n.birthday,n.value1,n.value2)
当匹配时
   更新设置 DataTable.name=n.name
输出  
       getdate(),$action,插入的.*
 LogTable;

-- 结果
选择 * from DataTable;
选择 * From LogTable;

查询后的 DataTable 表格

DataId name birthday value1 value2
1 John 1990-01-01 abc 123
2 Sally 1990-03-03 def 321
3 Steve 1990-03-03 ghi 333
4 Alice 1990-04-03 aaa 222

查询后的 LogTable 表格

id actiondate action DataId name birthday value1 value2
1 2023-05-17 20:17:13.520 INSERT 4 Alice 1990-04-03 aaa 222
2 2023-05-17 20:17:13.520 UPDATE 1 John 1990-01-01 abc 123
3 2023-05-17 20:17:13.520 UPDATE 3 Steve 1990-03-03 ghi 333

示例链接

还有执行计划 (https://dbfiddle.uk/yXVgKV92)


<details>
<summary>英文:</summary>

Try this  

-- data model
create table DataTable(DataId int primary key identity,name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));
set identity_insert DataTable ON;
insert into DataTable (DataId,name,birthday,value1,value2) values
(1,'John','1990-01-01','abc','123')
,(2,'Sally','1990-03-03','def','321')
,(3,'Steve','1990-03-03','ghi','333')
;
set identity_insert DataTable OFF;

create table NewDataTable(name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));
insert into NewDataTable values
('John','1990-01-01','abc','123')
,('Steve','1990-03-03','ghi','333')
,('Alice','1990-04-03','aaa','222')
;
create table LogTable(id int identity,actiondate datetime,action varchar(10),DataId int ,name nvarchar(100),birthday date
,value1 nvarchar(100),value2 nvarchar(100));

-- query
merge DataTable
using NewDataTable n on n.name=DataTable.name -- ... etc
when not matched then
insert(name,birthday,value1,value2) values(n.name,n.birthday,n.value1,n.value2)
when matched then
update set DataTable.name=n.name
output
getdate(),$action,inserted.*
into LogTable;

-- results
select * from DataTable;
select * From LogTable;

DataTable after query  

|DataId|name|birthday|value1|	value2|
|-:|:----|:-----------|:--|:--|
|1|	John|	1990-01-01|	abc|	123|
|2|	Sally|	1990-03-03|	def|	321|
|3|	Steve|	1990-03-03|	ghi|	333|
|4|	Alice|	1990-04-03|	aaa|	222|

LogTable after query  

|id|actiondate|	action|	DataId|	name|    birthday|value1|	value2|
|-:|:----------------------|:-----|-:|:----|:---------|:---|:---|
|1|	2023-05-17 20:17:13.520|INSERT|	4|Alice|1990-04-03|	aaa|	222|
|2|	2023-05-17 20:17:13.520|UPDATE|	1|John|	1990-01-01|	abc|	123|
|3|	2023-05-17 20:17:13.520|UPDATE|	3|Steve|1990-03-03|	ghi|	333|

[Example here](https://dbfiddle.uk/Pl7IyRYR)     
   
and with plan (https://dbfiddle.uk/yXVgKV92)

</details>



huangapple
  • 本文由 发表于 2023年5月17日 22:54:53
  • 转载请务必保留本文链接:https://go.coder-hub.com/76273460.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定