英文:
SQL update field from another table by selecting a value based on table row
问题
我有一个数据库表,其中保存了由数据创建工具创建的随机个人详细信息。
我想要使用这个表中的值来更新另一个表中的字段,以便在我们在另一台服务器上排除问题时掩盖真实的个人详细信息。
表格如下:
表格 1:tblRandom_Data
字段:First_Name | Surname
表格 2:tblCustomers
字段:CustomerID[GUID] | First_Name | Surname | Address_Line1..........
我想要通过从tblRandom_Data中选择随机的First_Name值来更新tblCustomers中的所有行的First_Name字段。
我尝试过以下方法,但这会将所有行都更新为选择的相同随机姓氏。并不是所有的值都需要是唯一的,但需要有一些差异。
Update tblCustomers
SET First_Name = (SELECT TOP 1 First_Name
FROM tblRandom_Data
ORDER BY NEWID()
)
英文:
I have a db table which holds random personal details created by a data creation tool.
I want to use the values from this table to update fields in another table in order to mask live personal details when we are troubleshooting issues on another server.
Tables would be as follows:
Table 1: tblRandom_Data
Fields: First_Name | Surname
Table 2: tblCustomers
Fields: CustomerID[GUID] | First_Name | Surname | Address_Line1..........
I want to update the First_Name field on all rows in tblCustomers by selecting a random First_Name value from tblRandom_Data.
I have tried the following but this updates all rows with the same random first name selected. All values do not need to be unique, but there needs to be some differences.
Update tblCustomers
SET First_Name = (SELECT TOP 1 First_Name
FROM tblRandom_Data
ORDER BY NEWID()
)
答案1
得分: 1
你可以使用 apply
:
从 tblCustomers c 中选择 c.* 和 rd.first_name
使用 cross apply
( 从 tblRandom_Data rd 中选择 top (1) rd.first_name
其中 rd.address_line1 <> c.first_name -- 你可以使用任何总是不匹配的列组合
按 newid() 排序
) rd;
英文:
You can use apply
:
select c.*, rd.first_name
from tblCustomers c cross apply
( select top (1) rd.first_name
from tblRandom_Data rd
where rd.address_line1 <> c.first_name -- you can use any column combination that would always be not matched with each other
order by newid()
) rd;
答案2
得分: 1
The problem is that SQL Server "optimizes" the query to only run the subquery once.
我发现最简单的解决方法是使用关联子查询,即使该子查询是无意义的。假设 tblCustomers.First_Name
永远不是 NULL
:
更新 c
将 First_Name 设置为 (选择 TOP 1 First_Name
从 tblRandom_Data rd
其中 rd.First_Name 不等于 c.First_Name
按 NEWID() 排序
)
从 tblCustomers c 中;
另一种方法是通过"循环"这些值:
更新 c
将 First_Name 设置为 rd.First_Name
从 (选择 c.*,
ROW_NUMBER() OVER (按 NEWID() 排序) as seqnum
从 tblCustomers c
) c 加入
(选择 rd.*,
ROW_NUMBER() OVER (按 NEWID() 排序) as seqnum,
COUNT(*) OVER () as cnt
从 tblRandom_Data rd
) rd
在 (c.seqnum % rd.cnt) = rd.seqnum - 1;
这也有助于"平衡"这些名称。因此,如果客户比其他表中的行数多,名称将被分配大致相同的次数。
英文:
The problem is that SQL Server "optimizes" the query to only run the subquery once.
I find that the simplest solution is a correlation clause -- even if that clause is meaningless. Assuming that tblCustomers.First_Name
is never NULL
:
UPDATE c
SET First_Name = (SELECT TOP 1 First_Name
FROM tblRandom_Data rd
WHERE rd.First_Name <> c.First_Name
ORDER BY NEWID()
)
FROM tblCustomers c;
Another method is to "cycle" through the values:
UPDATE c
SET First_Name = rd.First_Name
FROM (SELECT c.*,
ROW_NUMBER() OVER (ORDER BY NEWID()) as seqnum
FROM tblCustomers c
) c JOIN
(SELECT rd.*,
ROW_NUMBER() OVER (ORDER BY NEWID()) as seqnum,
COUNT(*) OVER () as cnt
FROM tblRandom_Data rd
) rd
ON (c.seqnum % rd.cnt) = rd.seqnum - 1;
This also has the benefit of "balancing" the names. So if there are more customers than rows in the other table, the names will be assigned about the same number of times.
答案3
得分: 0
以下是使用 row_number()
和可更新的 CTE 进行操作的一种方法:
with cte as (
select
c.first_name,
c.surname,
r.first_name r_first_name,
r.surname r_surname,
row_number() over(partition by c.customer_id order by newid()) rn
from tblCustomers c
cross join tblRandom_Data r
)
update cte
set first_name = r_first_name, surname = r_surname
where rn = 1
CTE 生成两个表的笛卡尔积,并为具有相同 customer_id
的记录组内的每个记录分配一个随机数。外部查询执行实际更新操作。
示例数据:
select * from tblCustomers;
customer_id | first_name | surname
----------: | :--------- | :------
1 | <em>null</em> | <em>null</em>
2 | <em>null</em> | <em>null</em>
3 | <em>null</em> | <em>null</em>
4 | <em>null</em> | <em>null</em>
select * from tblRandom_Data;
first_name | surname
:--------- | :------
foo | bar
baz | zoo
运行更新后:
select * from tblCustomers;
customer_id | first_name | surname
----------: | :--------- | :------
1 | baz | zoo
2 | foo | bar
3 | baz | zoo
4 | foo | bar
【注意】:在代码中的 HTML 标记(如 <em>
)可能只是用于显示和格式化,并不影响数据库操作。如果需要进一步的解释或修改,请提供更多上下文。
英文:
Here is one way to do it with row_number()
and an updatable cte:
with cte as (
select
c.first_name,
c.surname,
r.first_name r_first_name,
r.surname r_surname,
row_number() over(partition by c.customer_id order by newid()) rn
from tblCustomers c
cross join tblRandom_Data r
)
update cte
set first_name = r_first_name, surname = r_surname
where rn = 1
The cte generates a cartesian produt of both tables and affects a random number to each record within groups having the same customer_id
. The outer query perform the actual update.
Sample data:
select * from tblCustomers;
<pre>
customer_id | first_name | surname |
---|---|---|
1 | <em>null</em> | <em>null</em> |
2 | <em>null</em> | <em>null</em> |
3 | <em>null</em> | <em>null</em> |
4 | <em>null</em> | <em>null</em> |
</pre> |
select * from tblRandom_Data;
<pre>
first_name | surname |
---|---|
foo | bar |
baz | zoo |
</pre> |
After running the update:
select * from tblCustomers;
<pre>
customer_id | first_name | surname |
---|---|---|
1 | baz | zoo |
2 | foo | bar |
3 | baz | zoo |
4 | foo | bar |
</pre> |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论