英文:
TSQL - Find matches from a all values in 1 column in 1 table to compare to table #2 that has a text field
问题
I need to compare the stain_name in table #1 to results column in table #2. I need to return the acc_id from table #2 and stain_name from table #1 if there was a match and insert it into a table called stains_in_report table #3. I'm not able to use a cursor to save the text to a variable because SQL doesn't allow that. Can anyone help me get started with this?
Table #1
id stain_name
1 AA
2 AA Amyloid
3 Adenovirus
4 AE1
5 AE3
Table #2: Results table: (very small data set example) results column is a text field which could have many pages of text in the field.
acc_id results
33022 Patient A had AA stains performed
33137 Patient B had Adenovirus stain performed
33150 Patient C had no stains performed
33175 Patient D had AA stains performed
If there was a match of stain = AA in the results column in table #2; I want to insert stain_name, acc_id into table 3 (stains_in_report)
table #3
stains in report schema
id,
stain_name,
acc_id
expected output from example:
ID stain_name acc_id
1 AA 33022
2 AA 33175
3 Adenovirus 33137
英文:
I need to compare the stain_name in table #1 to results column in table #2. I need to return the acc_id from table #2 and stain_name from table #1 if there was a match and insert it into a table called stains_in_report table #3. I'm not able to use a cursor to save the text to a variable because SQL doesn't allow that. Can anyone help me get started with this?
Table #1
id stain_name
1 AA
2 AA Amyloid
3 Adenovirus
4 AE1
5 AE3
Table #2: Results table: (very small data set example) results column is a text field which could have many pages of text in the field.
acc_id results
33022 Patient A had AA stains performed
33137 Patient B had Adenovirus stain performed
33150 Patient C had no stains performed
33175 Patient D had AA stains performed
If there was a match of stain = AA in the results column in table # 2; I want to insert stain_name, acc_id into table 3 (stains_in_report)
table #3
stains in report schema
id,
stain_name,
acc_id
expected output from example:
ID stain_name acc_id
1 AA 33022
2 AA 33175
3 Adenovirus 33137
答案1
得分: 1
这可能是一个解决方案-要在这么多文本中搜索匹配项,我对其效果持怀疑态度(如果你正在搜索像AT
、IN
或ON
这样的代码,它们可能会匹配所有的结果行),但这可能是你的起点:
drop table if exists #Table1, #Table2
create table #Table1 (id int, stain_name nvarchar(100))
create table #Table2 (acc_id int, results nvarchar(max))
insert into #Table1 values
(1, 'AA'),
(2, 'AA Amyloid'),
(3, 'Adenovirus'),
(4, 'AE1'),
(5, 'AE2')
insert into #Table2 values
(33022, 'bacon round alcatra t-bone aa amyloid hamburger fatback sausage'),
(33137, 'chicken t-bone AE1 ham venison bacon prosciutto'),
(33150, 'bacon brisket pancetta AE2 picanha AA sirloin')
select
t2.acc_id,
t2.results,
t1.id,
t1.stain_name
from
#Table1 t1
inner join #Table2 t2
on t2.results like '%' + t1.stain_name + '%'
结果(如您所见,即使是小样本,也出现了一些问题-即,与AA
匹配的任何内容也会与AA Amyloid
匹配-所以您会保留这两个匹配项,优先哪一个,还是以某种方式将它们合并?如果有三个匹配项或四个匹配项会怎么样?可能会变得混乱):
acc_id | results | id | stain_name
------ | ----------------------------------------------------------- | -- | ----------
33022 | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 1 | AA
33022 | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 2 | AA Amyloid
33137 | chicken t-bone AE1 ham venison bacon prosciutto | 4 | AE1
33150 | bacon brisket pancetta AE2 picanha AA sirloin | 1 | AA
33150 | bacon brisket pancetta AE2 picanha AA sirloin | 5 | AE2
英文:
This would be a solution - with so much text to search for matches in, I am skeptical how well it will work (if you have codes like AT
, IN
, or ON
that you are searching for they might well match all your results rows), but it could be a starting point for you:
drop table if exists #Table1, #Table2
create table #Table1 (id int, stain_name nvarchar(100))
create table #Table2 (acc_id int, results nvarchar(max))
insert into #Table1 values
(1, 'AA'),
(2, 'AA Amyloid'),
(3, 'Adenovirus'),
(4, 'AE1'),
(5, 'AE2')
insert into #Table2 values
(33022, 'bacon round alcatra t-bone aa amyloid hamburger fatback sausage'),
(33137, 'chicken t-bone AE1 ham venison bacon prosciutto'),
(33150, 'bacon brisket pancetta AE2 picanha AA sirloin')
select
t2.acc_id,
t2.results,
t1.id,
t1.stain_name
from
#Table1 t1
inner join #Table2 t2
on t2.results like '%' + t1.stain_name + '%'
Results (as you can see, even with a small sample, some problems are surfacing - i.e., anything that matches AA
will also match AA Amyloid
- so would you keep both matches, prioritize one or the other, combine them somehow? What if there are three matches, or four matches? It can get ugly):
acc_id | results | id | stain_name |
---|---|---|---|
33022 | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 1 | AA |
33022 | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 2 | AA Amyloid |
33137 | chicken t-bone AE1 ham venison bacon prosciutto | 4 | AE1 |
33150 | bacon brisket pancetta AE2 picanha AA sirloin | 1 | AA |
33150 | bacon brisket pancetta AE2 picanha AA sirloin | 5 | AE2 |
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论