TSQL – Find matches from a all values in 1 column in 1 table to compare to table #2 that has a text field

huangapple go评论62阅读模式
英文:

TSQL - Find matches from a all values in 1 column in 1 table to compare to table #2 that has a text field

问题

I need to compare the stain_name in table #1 to results column in table #2. I need to return the acc_id from table #2 and stain_name from table #1 if there was a match and insert it into a table called stains_in_report table #3. I'm not able to use a cursor to save the text to a variable because SQL doesn't allow that. Can anyone help me get started with this?

Table #1

id stain_name
1 AA
2 AA Amyloid
3 Adenovirus
4 AE1
5 AE3

Table #2: Results table: (very small data set example) results column is a text field which could have many pages of text in the field.

acc_id results
33022 Patient A had AA stains performed
33137 Patient B had Adenovirus stain performed
33150 Patient C had no stains performed
33175 Patient D had AA stains performed

If there was a match of stain = AA in the results column in table #2; I want to insert stain_name, acc_id into table 3 (stains_in_report)

table #3
stains in report schema
id,
stain_name,
acc_id

expected output from example:

ID stain_name acc_id
1 AA 33022
2 AA 33175
3 Adenovirus 33137

英文:

I need to compare the stain_name in table #1 to results column in table #2. I need to return the acc_id from table #2 and stain_name from table #1 if there was a match and insert it into a table called stains_in_report table #3. I'm not able to use a cursor to save the text to a variable because SQL doesn't allow that. Can anyone help me get started with this?

Table #1

id      stain_name
1	AA
2	AA Amyloid
3	Adenovirus
4	AE1
5	AE3

Table #2: Results table: (very small data set example) results column is a text field which could have many pages of text in the field.

acc_id  results 
33022	Patient A had AA stains performed
33137	Patient B had Adenovirus stain performed
33150	Patient C had no stains performed
33175   Patient D had AA stains performed

If there was a match of stain = AA in the results column in table # 2; I want to insert stain_name, acc_id into table 3 (stains_in_report)

table #3
stains in report schema
id,
stain_name,
acc_id

expected output from example:

ID    stain_name    acc_id
1      AA            33022	
2      AA            33175
3      Adenovirus    33137	

答案1

得分: 1

这可能是一个解决方案-要在这么多文本中搜索匹配项,我对其效果持怀疑态度(如果你正在搜索像ATINON这样的代码,它们可能会匹配所有的结果行),但这可能是你的起点:

drop table if exists #Table1, #Table2
create table #Table1 (id int, stain_name nvarchar(100))
create table #Table2 (acc_id int, results nvarchar(max))

insert into #Table1 values 
    (1, 'AA'),
    (2, 'AA Amyloid'),
    (3, 'Adenovirus'),
    (4, 'AE1'),
    (5, 'AE2')

insert into #Table2 values 
    (33022, 'bacon round alcatra t-bone aa amyloid hamburger fatback sausage'),
    (33137, 'chicken t-bone AE1 ham venison bacon prosciutto'),
    (33150, 'bacon brisket pancetta AE2 picanha AA sirloin')

select 
    t2.acc_id,
    t2.results,
    t1.id,
    t1.stain_name 
from 
    #Table1 t1
    inner join #Table2 t2
on t2.results like '%' + t1.stain_name + '%'

结果(如您所见,即使是小样本,也出现了一些问题-即,与AA匹配的任何内容也会与AA Amyloid匹配-所以您会保留这两个匹配项,优先哪一个,还是以某种方式将它们合并?如果有三个匹配项或四个匹配项会怎么样?可能会变得混乱):

acc_id | results                                                     | id | stain_name
------ | ----------------------------------------------------------- | -- | ----------
33022  | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 1  | AA
33022  | bacon round alcatra t-bone aa amyloid hamburger fatback sausage | 2  | AA Amyloid
33137  | chicken t-bone AE1 ham venison bacon prosciutto             | 4  | AE1
33150  | bacon brisket pancetta AE2 picanha AA sirloin               | 1  | AA
33150  | bacon brisket pancetta AE2 picanha AA sirloin               | 5  | AE2
英文:

This would be a solution - with so much text to search for matches in, I am skeptical how well it will work (if you have codes like AT, IN, or ON that you are searching for they might well match all your results rows), but it could be a starting point for you:

drop table if exists #Table1, #Table2
create table #Table1 (id int, stain_name nvarchar(100))
create table #Table2 (acc_id int, results nvarchar(max))

insert into #Table1 values 
	(1, 'AA'),
	(2, 'AA Amyloid'),
	(3, 'Adenovirus'),
	(4, 'AE1'),
	(5, 'AE2')

insert into #Table2 values 
	(33022, 'bacon round alcatra t-bone aa amyloid hamburger fatback sausage'),
	(33137, 'chicken t-bone AE1 ham venison bacon prosciutto'),
	(33150, 'bacon brisket pancetta AE2 picanha AA sirloin')

select 
	t2.acc_id,
	t2.results,
	t1.id,
	t1.stain_name 
from 
	#Table1 t1
	inner join #Table2 t2
on t2.results like '%' + t1.stain_name + '%'

Results (as you can see, even with a small sample, some problems are surfacing - i.e., anything that matches AA will also match AA Amyloid - so would you keep both matches, prioritize one or the other, combine them somehow? What if there are three matches, or four matches? It can get ugly):

acc_id results id stain_name
33022 bacon round alcatra t-bone aa amyloid hamburger fatback sausage 1 AA
33022 bacon round alcatra t-bone aa amyloid hamburger fatback sausage 2 AA Amyloid
33137 chicken t-bone AE1 ham venison bacon prosciutto 4 AE1
33150 bacon brisket pancetta AE2 picanha AA sirloin 1 AA
33150 bacon brisket pancetta AE2 picanha AA sirloin 5 AE2

huangapple
  • 本文由 发表于 2023年6月9日 00:03:20
  • 转载请务必保留本文链接:https://go.coder-hub.com/76433755.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定