SQL Case语句多重匹配

huangapple go评论58阅读模式
英文:

SQL Case Statement multiple matches

问题

以下是你提供的内容的翻译:

I am trying to make a case statement in SQL for grouping however I am running into an issue where its grouping incorrectly if there's multiple matches.

我正在尝试在SQL中创建一个用于分组的CASE语句,但我遇到了一个问题,如果有多个匹配项,它会错误地进行分组。

Is there a way so it searches word by word as opposed to whole thing and gives results by which word shows up first?

是否有一种方法可以按单词逐个搜索,而不是整个内容,并根据哪个单词首先出现来返回结果?

Subject Examples:

主题示例:

the animal rode in the car --- EXPECTED RESULT Animal

the animal rode in the car --- 期望结果 Animal

in the car the animal rode --- EXPECTED RESULT CAR... actual result Animal

in the car the animal rode --- 期望结果 CAR... 实际结果 Animal

CASE 
WHEN Subject like '%animal%' then 'Animal'
WHEN Subject like '%car%' then 'Car'
END as Category 
CASE
WHEN Subject like '%animal%' then 'Animal'
WHEN Subject like '%car%' then 'Car'
END as Category
英文:

I am trying to make a case statement in SQL for grouping however I am running into an issue where its grouping incorrectly if there's multiple matches.

Is there a way so it searches word by word as opposed to whole thing and gives results by which word shows up first?

Subject Examples:

the animal rode in the car --- EXPECTED RESULT Animal

in the car the animal rode --- EXPECTED RESULT CAR... actual result Animal

CASE 
WHEN Subject like '%animal%' then 'Animal'
WHEN Subject like '%car%' then 'Car'
END as Category 

答案1

得分: 1

使用CASE语句的问题在于整个测试是按顺序执行的。您没有测试哪个先出现。

这可能看起来复杂,但我尝试让它对您来说易于实现。

首先让我们看一下最内部的连接:

  1. 第一个FROM子句查看您要测试的项。这可以转化为SELECT语句。
  2. 接下来是您要查找的值。
  3. 找到位置。

接下来的最内部查询:

  1. 基本上添加了一个行编号,以按最小位置(先出现的位置)对记录进行排序。
  2. 还删除了没有找到匹配项的测试(基本上是位置为0的测试)。

最后,最外部的查询:

  1. 整理您的数据并仅选择首先出现的位置。

结果:

in the car the animal rode	CAR	    8	1
the animal rode in the car	Animal	5   1
英文:

The problem with using a CASE statement is that the entire test is performed in order. You aren't testing which comes first.

This might look complex but I tried to make it easy for implementation for you.

select *
from (
   select test,Category=vals,position, rn = row_number() over (partition by test order by position)
   from(
      select *
      from (values('the animal rode in the car'),('in the car the animal rode')) as tests(test)
	    cross apply (values('CAR'),('Animal')) as Lookfor(vals)
	    cross apply (select CHARINDEX(Lookfor.vals,tests.test)) as p(position)
	) a
    where position>0
) rn
where rn.rn=1

First let's look at the innermost joins:

  1. The first from looks at items you want to test. This can be translated into a SELECT.
  2. Next are the values you want to look up
  3. Finds the position

The next innermost query:

  1. This basically add a row_runmber to sort records by the smallest position (which came first)
  2. Also removed and tests that did not find a match (basically the position of 0).

Finally, the outermost query:

  1. Organizes your data and selects only the position that came first.

Results:

in the car the animal rode	CAR	    8	1
the animal rode in the car	Animal	5   1

答案2

得分: 0

DbFiddle Demo

与其他答案类似,您可以使用charindex来确定任何匹配项的索引并取第一个出现的匹配项。当然,这仅适用于您始终执行like '%something%'比较的情况;如果您的模式更复杂(例如like '%this%and%that%'),charindex将不会有太多帮助。

这种方法将您的规则放入一个名为MatchLogic的表中,允许您轻松构建规则列表而不必修改代码本身。

declare @testdata table (test nvarchar(64))
insert @testdata (test)
values ('the animal rode in the car')
, ('in the car the animal rode')
, ('the dog rode in the the van')

declare @matchLogic table (partialString nvarchar(32), category nvarchar(32))
insert @matchLogic 
values ('animal', 'Animal')
, ('car', 'Car')

select test, (
	select top 1 category 
	from @matchLogic 
	where test like '%' + partialString + '%' 
	order by charindex(partialString, test) asc
) category
from @testData t

另一种方法是从您的字符串匹配1个字符,然后是2个字符,然后是3个字符,依此类推,直到获得第一个匹配项。

注意:

  • 这非常低效,不适用于大型数据集。
  • 这获取第一个完整匹配项;而不是第一个字符匹配的匹配项;例如,查看monkey ... animal ... bicycle示例,其中animal是第一个完整匹配项,即使monkey出现在animal之前,因为只有在获得完整字符串后,monkey ... bicycle才会匹配一次。
declare @testdata table (id bigint identity(1,1),test nvarchar(64))
insert @testdata (test)
values ('the animal rode in the car')
, ('in the car the animal rode')
, ('the monkey rode the bicycle')
, ('the monkey (an animal) rode the bicycle')
, ('the dog rode in the the van')

declare @matchLogic table (pattern nvarchar(32), category nvarchar(32))
insert @matchLogic 
values ('%animal%', 'Animal')
, ('%car%', 'Car')
, ('%monkey%bicycle%', 'MonkeyBike')

; with substrings as (
	select id, test part, len(test) chars
	from @testdata 
	union all 
	select id, substring(part, 1, chars-1), chars-1
	from substrings s
	where chars > 1
)
select t.id, t.test, x.category
from @testData t
outer apply (
	select top 1 category
	from @matchLogic m
	inner join substrings s
	on part like m.pattern
	where s.id = t.id
	order by s.chars asc
) x
英文:

DbFiddle Demo

Similar to other answers, you can use charindex to determine the index of any matches and take whichever comes first. Of course, this only works if you're always doing like '%something%' comparisons; if your pattern is more complex (e.g. like '%this%and%that%') charindex won't be much help.

This approach puts your rules into a table called MatchLogic, allowing you to easily build up your list of rules without amending the code itself.

declare @testdata table (test nvarchar(64))
insert @testdata (test)
values ('the animal rode in the car')
, ('in the car the animal rode')
, ('the dog rode in the the van')

declare @matchLogic table (partialString nvarchar(32), category nvarchar(32))
insert @matchLogic 
values ('animal', 'Animal')
, ('car', 'Car')

select test, (
	select top 1 category 
	from @matchLogic 
	where test like '%' + partialString + '%' 
	order by charindex(partialString, test) asc
) category
from @testData t

Another approach would be to match against 1 character from your string, then 2, then 3, etc until you get your first match.

Note:

  • This is very inefficient, so not suitable for large datasets.
  • This gets the first full match; not the match whose first character comes first; e.g. see the monkey ... animal ... bicycle example where animal is the first complete match even though monkey comes first, as monkey ... bicycle only matches once we have the full string.

declare @testdata table (id bigint identity(1,1),test nvarchar(64))
insert @testdata (test)
values ('the animal rode in the car')
, ('in the car the animal rode')
, ('the monkey rode the bicycle')
, ('the monkey (an animal) rode the bicycle')
, ('the dog rode in the the van')

declare @matchLogic table (pattern nvarchar(32), category nvarchar(32))
insert @matchLogic 
values ('%animal%', 'Animal')
, ('%car%', 'Car')
, ('%monkey%bicycle%', 'MonkeyBike')

; with substrings as (
	select id, test part, len(test) chars
	from @testdata 
	union all 
	select id, substring(part, 1, chars-1), chars-1
	from substrings s
	where chars > 1
)
select t.id, t.test, x.category
from @testData t
outer apply (
	select top 1 category
	from @matchLogic m
	inner join substrings s
	on part like m.pattern
	where s.id = t.id
	order by s.chars asc
) x

答案3

得分: -1

你关于它如何检查的理解是错误的。首先将检查第一个 when 子句,如果匹配,第二个将不会被评估。你的两个示例都匹配第一个 when 子句,所以触发的是这个。

这应该给你想要的结果:

case 
 when Subject  like '%animal%' and (Subject not like '%car%' OR  CHARINDEX('animal',Subject,1) <= CHARINDEX('car',Subject,1)) then 'Animal'
 when Subject  like  '%car%'  and (Subject not like '%animal%' OR  CHARINDEX('car',Subject,1) <= CHARINDEX('animal',Subject,1))  then 'Car' 
 else 'UNKNOWN'
 end AS Category,
英文:

You are mistaken on how it checks. The first when clause will get checked first, if it matches, the second one will not get evaluated. Both of your examples match the first when, so that is what is getting triggered.

This should give you what you want:

case 
 when Subject  like &#39;%animal%&#39; and (Subject not like &#39;%car%&#39; OR  CHARINDEX(&#39;animal&#39;,Subject,1) &lt;= CHARINDEX(&#39;car&#39;,Subject,1)) then &#39;Animal&#39;
 when Subject  like  &#39;%car%&#39;  and (Subject not like &#39;%animal%&#39; OR  CHARINDEX(&#39;car&#39;,Subject,1) &lt;= CHARINDEX(&#39;animal&#39;,Subject,1))  then &#39;Car&#39; 
 else &#39;UNKNOWN&#39;
 end AS Category,

huangapple
  • 本文由 发表于 2023年2月7日 04:36:53
  • 转载请务必保留本文链接:https://go.coder-hub.com/75366319.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定