2023年2月10日 03:17:49go评论65阅读模式

英文:

INITCAP in a query in SQL Server

问题

使用SQL Server 2016，我需要以特定方式清除空格并实现INITCAP。

空格清除程序很简单。我在正确实现INITCAP替代时遇到了问题。

https://stackoverflow.com/questions/26012822/initcap-equivalent-in-mssql 上的接受答案是错误的，正如第一条评论中所指出的那样。

我的数据包含具有连续多个空格和特殊字符（&、%等）的值。

stuff(): 在SQL Server 2016中，string_split没有提供用于验证序数值的选项，也不能保证结果以特定顺序返回。因此，我需要编写代码来确保从split_string返回的值按正确顺序排列。

convert(xml,...): 解码大多数XML编码的值。

convert(varchar(max),...): ...因为XML无法在需要SELECT DISTINCT时使用

SQL Fiddle

with T as (
  select *
  from (
  values ('Minesota Mining and   Manufacturing')
  , ('Minesota Mining &amp; Manufacturing   ')
  , (' tillamook')
  , ('MUTUAL OF OMAHA')
  , ('   ')
  ) q(s)
),
scrubbed as (
  select T.s as InitialValue
  , CASE 
      WHEN LEN(RTRIM(T.s)) > 0
        THEN LTRIM(RTRIM(T.s))
    END as s
  from T
)
select distinct s.InitialValue
, stuff(
    (
      SELECT ' ' + t2.word
      from (
          select str.value
          , upper(substring(str.value, 1, 1)) + 
            case when len(str.value) > 1 then lower(substring(str.value, 2, len(str.value) - 1)) else '' end as word
          , charindex(' ' + str.value + ' ', ' ' + s.s + ' ') as idx
          from string_split(s.s, ' ') str
        ) t2
      order by t2.idx
      FOR XML PATH('')
    ), 
    1, 
    1, 
    ''
  ) as INITCAP_xml
, convert(
    varchar(max), 
    convert(
      xml, 
      stuff(
        (
          SELECT ' ' + t2.word
          from (
              select str.value
              , upper(substring(str.value, 1, 1)) + 
                case when len(str.value) > 1 then lower(substring(str.value, 2, len(str.value) - 1)) else '' end as word
              , charindex(' ' + str.value + ' ', ' ' + s.s + ' ') as idx
              from string_split(s.s, ' ') str
            ) t2
          order by t2.idx
          FOR XML PATH('')
        ), 
        1, 
        1, 
        ''
      )
    )
  ) as INITCAP_decoded
from scrubbed s

您可以看到，在使用FOR XML时，一些字符会被编码（如[space] = $#x20;和& = &amp;）。通过转换为XML数据类型，一些字符将被解码。但一些字符（如&amp;）仍然保持编码。

InitialValue	INITCAP_attempt1	INITCAP_xml	INITCAP_decoded
`Minesota Mining and Manufacturing`	`Minesota Mining And&#x20;&#x20; Manufacturing`	`Minesota Mining And Manufacturing`	`Minesota Mining And Manufacturing`
`Minesota Mining & Manufacturing`	`Minesota Mining &amp; Manufacturing`	`Minesota Mining &amp; Manufacturing`	`Minesota Mining &amp; Manufacturing`
`tillamook`	`Tillamook`	`Tillamook`	`Tillamook`
`MUTUAL OF OMAHA`	`Mutual Of Omaha`	`Mutual Of Omaha`	`Mutual Of Omaha`
	null	null	null

REPLACE(s, '&amp;', '&') 似乎不是一个合理的选项，因为我不知道随着时间的推移会遇到什么其他值。是否有一种处理由FOR XML编码的字符的通用方法？

在视图中（因此不使用用户定义函数或存储过程），是否有更好的方法来实现SQL Server中的INITCAP？

英文:

Using SQL Server 2016, I have a need to scrub white space a certain way and implement INITCAP.

The whitespace scrubber is simple. I'm having trouble getting the INITCAP replacement working properly.

The accepted answer to https://stackoverflow.com/questions/26012822/initcap-equivalent-in-mssql is wrong, as noted in the first comment.

My data contains values that have multiple spaces in a row and special characters, (&, %, etc.).

stuff(): In SQL Server 2016, string_split does not have an option to prove an ordinal value and does not guarantee that the results are returned in any specific order. So, I need to write code to ensure values are returned from split_string in the correct order.

convert(xml,...): Decodes most of the XML-encoded values.

convert(varchar(max),...): ...because XML can't be used when needing SELECT DISTINCT

SQL Fiddle

with T as (
select *
from (
values (&#39;Minesota Mining and   Manufacturing&#39;)
, (&#39;Minesota Mining &amp; Manufacturing   &#39;)
, (&#39; tillamook&#39;)
, (&#39;MUTUAL OF OMAHA&#39;)
, (&#39;   &#39;)
) q(s)
),
scrubbed as (
select T.s as InitialValue
, CASE 
WHEN LEN(RTRIM(T.s)) &gt; 0
THEN LTRIM(RTRIM(T.s))
END as s
from T
)
select distinct s.InitialValue
, stuff(
(
SELECT &#39; &#39; + t2.word
from (
select str.value
, upper(substring(str.value, 1, 1)) + 
case when len(str.value) &gt; 1 then lower(substring(str.value, 2, len(str.value) - 1)) else &#39;&#39; end as word
, charindex(&#39; &#39; + str.value + &#39; &#39;, &#39; &#39; + s.s + &#39; &#39;) as idx
from string_split(s.s, &#39; &#39;) str
) t2
order by t2.idx
FOR XML PATH(&#39;&#39;)
), 
1, 
1, 
&#39;&#39;
) as INITCAP_xml
, convert(
varchar(max), 
convert(
xml, 
stuff(
(
SELECT &#39; &#39; + t2.word
from (
select str.value
, upper(substring(str.value, 1, 1)) + 
case when len(str.value) &gt; 1 then lower(substring(str.value, 2, len(str.value) - 1)) else &#39;&#39; end as word
, charindex(&#39; &#39; + str.value + &#39; &#39;, &#39; &#39; + s.s + &#39; &#39;) as idx
from string_split(s.s, &#39; &#39;) str
) t2
order by t2.idx
FOR XML PATH(&#39;&#39;)
), 
1, 
1, 
&#39;&#39;
)
)
) as INITCAP_decoded
from scrubbed s

You see in the output that using FOR XML causes some of the characters to be encoded (like [space] = $#x20; and & = &amp;). By converting to XML data type, some of those characters are decoded. But some characters (like &amp;) remain encoded.

InitialValue	INITCAP_attempt1	INITCAP_xml	INITCAP_decoded
`Minesota Mining and Manufacturing`	`Minesota Mining And&#x20;&#x20; Manufacturing`	`Minesota Mining And Manufacturing`	`Minesota Mining And Manufacturing`
`Minesota Mining & Manufacturing`	`Minesota Mining &amp; Manufacturing`	`Minesota Mining &amp; Manufacturing`	`Minesota Mining &amp; Manufacturing`
`tillamook`	`Tillamook`	`Tillamook`	`Tillamook`
`MUTUAL OF OMAHA`	`Mutual Of Omaha`	`Mutual Of Omaha`	`Mutual Of Omaha`
	null	null	null

REPLACE(s, '&amp;', '&') doesn't seem like a reasonable option because I don't know what other values I'll run into over time. Is there a good, general way to handle characters that will be encoded by FOR XML?

Within a view (so, without using user defined functions or stored procedures), is there a better way to implement INITCAP in SQL Server?

答案1

得分: 2

以下是代码部分的翻译：

select *
       ,[dbo].[svf-Str-Proper] (S)
  from (
  values ('Minesota Mining and   Manufacturing')
  , ('Minesota Mining &amp; Manufacturing   ')
  , (' tillamook')
  , ('MUTUAL OF OMAHA')
  , ('   ')
  ) q(s)

Results

s                                   	(No column name)
Minesota Mining and   Manufacturing 	Minesota Mining And Manufacturing
Minesota Mining &amp; Manufacturing     	Minesota Mining &amp; Manufacturing
 tillamook	                            Tillamook
MUTUAL OF OMAHA	                        Mutual Of Omaha

**The Function if Iterested**

CREATE FUNCTION [dbo].[svf-Str-Proper] (@S varchar(max))
Returns varchar(max)
As
Begin
    Set @S = ' '+ltrim(rtrim(replace(replace(replace(lower(@S),' ','†‡'),'‡†',''),'†‡',' ')))+' ';
    ;with cte1 as (Select * From (Values(' '),('-'),('/'),('\'),('['),('{'),('('),('.'),(','),('&'),(' Mc'),(' Mac'),(' O''''') ) A(P))
         ,cte2 as (Select * From (Values('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I'),('J'),('K'),('L'),('M')
                                        ,('N'),('O'),('P'),('Q'),('R'),('S'),('T'),('U'),('V'),('W'),('X'),('Y'),('Z')
                                        ,('LLC'),('PhD'),('MD'),('DDS'),('II'),('III'),('IV')
                                     ) A(S))
         ,cte3 as (Select F = Lower(A.P+B.S),T = A.P+B.S From cte1 A Cross Join cte2 B 
                   Union All 
                   Select F = Lower(B.S+A.P),T = B.S+A.P From cte1 A Cross Join cte2 B where A.P in ('&') 
                  ) 
    Select @S = replace(@S,F,T) From cte3 
    Return rtrim(ltrim(@S))
End
-- Syntax : Select [dbo].[svf-Str-Proper]('john cappelletti')
--          Select [dbo].[svf-Str-Proper]('james e. o''''neil')
--          Select [dbo].[svf-Str-Proper]('CAPPELLETTI II,john old macdonald iv phd,dds llc b&o railroad bank-one at&t BD&I Bank-Five dr. Langdon,dds')

如果还有其他需要翻译的内容，请提供具体文本。

英文:

If interested in a SVF, here is a scaled down version which allows customization and edge events. For example rather than Phd, you would get PhD ... MacDonald, O'Neil

This is a dramatically scaled down version.. My rules/exceptions are in a generic mapping table.

Example

 select *
,[dbo].[svf-Str-Proper] (S)
from (
values (&#39;Minesota Mining and   Manufacturing&#39;)
, (&#39;Minesota Mining &amp; Manufacturing   &#39;)
, (&#39; tillamook&#39;)
, (&#39;MUTUAL OF OMAHA&#39;)
, (&#39;   &#39;)
) q(s)

Results

s                                   	(No column name)
Minesota Mining and   Manufacturing 	Minesota Mining And Manufacturing
Minesota Mining &amp; Manufacturing     	Minesota Mining &amp; Manufacturing
tillamook	                            Tillamook
MUTUAL OF OMAHA	                        Mutual Of Omaha

The Function if Iterested

CREATE FUNCTION [dbo].[svf-Str-Proper] (@S varchar(max))
Returns varchar(max)
As
Begin
Set @S = &#39; &#39;+ltrim(rtrim(replace(replace(replace(lower(@S),&#39; &#39;,&#39;†‡&#39;),&#39;‡†&#39;,&#39;&#39;),&#39;†‡&#39;,&#39; &#39;)))+&#39; &#39;
;with cte1 as (Select * From (Values(&#39; &#39;),(&#39;-&#39;),(&#39;/&#39;),(&#39;\&#39;),(&#39;[&#39;),(&#39;{&#39;),(&#39;(&#39;),(&#39;.&#39;),(&#39;,&#39;),(&#39;&amp;&#39;),(&#39; Mc&#39;),(&#39; Mac&#39;),(&#39; O&#39;&#39;&#39;) ) A(P))
,cte2 as (Select * From (Values(&#39;A&#39;),(&#39;B&#39;),(&#39;C&#39;),(&#39;D&#39;),(&#39;E&#39;),(&#39;F&#39;),(&#39;G&#39;),(&#39;H&#39;),(&#39;I&#39;),(&#39;J&#39;),(&#39;K&#39;),(&#39;L&#39;),(&#39;M&#39;)
,(&#39;N&#39;),(&#39;O&#39;),(&#39;P&#39;),(&#39;Q&#39;),(&#39;R&#39;),(&#39;S&#39;),(&#39;T&#39;),(&#39;U&#39;),(&#39;V&#39;),(&#39;W&#39;),(&#39;X&#39;),(&#39;Y&#39;),(&#39;Z&#39;)
,(&#39;LLC&#39;),(&#39;PhD&#39;),(&#39;MD&#39;),(&#39;DDS&#39;),(&#39;II&#39;),(&#39;III&#39;),(&#39;IV&#39;)
) A(S))
,cte3 as (Select F = Lower(A.P+B.S),T = A.P+B.S From cte1 A Cross Join cte2 B 
Union All 
Select F = Lower(B.S+A.P),T = B.S+A.P From cte1 A Cross Join cte2 B where A.P in (&#39;&amp;&#39;) 
) 
Select @S = replace(@S,F,T) From cte3 
Return rtrim(ltrim(@S))
End
-- Syntax : Select [dbo].[svf-Str-Proper](&#39;john cappelletti&#39;)
--          Select [dbo].[svf-Str-Proper](&#39;james e. o&#39;&#39;neil&#39;)
--          Select [dbo].[svf-Str-Proper](&#39;CAPPELLETTI II,john old macdonald iv phd,dds llc b&amp;o railroad bank-one at&amp;t BD&amp;I Bank-Five dr. Langdon,dds&#39;)

答案2

得分: 1

请尝试以下解决方案。

它使用了SQL Server XML、XQuery和其FLWOR表达式。

值得注意的是：

cast as xs:token? 处理了空白字符，即：
- 所有不可见的制表符、回车和换行符将被替换为空格。
- 然后从值中删除前导和尾随空格。
- 此外，多于一个空格的连续出现将被替换为一个空格。
FLWOR表达式确保了正确的大小写。

SQL

-- DDL和示例数据填充，开始
DECLARE @tbl TABLE (tokens VARCHAR(MAX));
INSERT @tbl (tokens) VALUES
('mineSota Mining and   MaNufacturing'),
('Minesota Mining &amp; Manufacturing   '),
('tillamook'),
('MUTUAL  OF   OMAHA'),
('   ');
-- DDL和示例数据填充，结束
DECLARE @separator CHAR(1) = SPACE(1);
SELECT t.*, scrubbed
, result = c.query('
for $x in /root/r/text()
return concat(upper-case(substring($x,1,1)),lower-case(substring($x,2,1000)))
').value('text()[1]', 'VARCHAR(MAX)')
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST('<r><![CDATA[' + tokens + ' ' + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)')) AS t1(scrubbed)
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
REPLACE(scrubbed, @separator, ']]></r><r><![CDATA[') + 
']]></r></root>' AS XML)) AS t2(c);

输出

tokens	scrubbed	result
mineSota Mining and MaNufacturing	mineSota Mining and MaNufacturing	Minesota Mining And Manufacturing
Minesota Mining & Manufacturing	Minesota Mining & Manufacturing	Minesota Mining & Manufacturing
tillamook	tillamook	Tillamook
MUTUAL OF OMAHA	MUTUAL OF OMAHA	Mutual Of Omaha
NULL

英文:

Please try the following solution.

It is using SQL Server XML, XQuery, and its FLWOR expression.

Notable points:

cast as xs:token? is taking care of the whitespaces, i.e:
- All invisible TAB, Carriage Return, and Line Feed characters will be
  replaced with spaces.
- Then leading and trailing spaces are removed from the value.
- Further, contiguous occurrences of more than one space will be replaced with a single space.
FLWOR expression is taking care of a proper case.

SQL

-- DDL and sample data population, start
DECLARE @tbl TABLE (tokens VARCHAR(MAX));
INSERT @tbl (tokens) VALUES
(&#39;mineSota Mining and   MaNufacturing&#39;),
(&#39;Minesota Mining &amp; Manufacturing   &#39;),
(&#39; tillamook&#39;),
(&#39;MUTUAL  OF   OMAHA&#39;),
(&#39;   &#39;);
-- DDL and sample data population, end
DECLARE @separator CHAR(1) = SPACE(1);
SELECT t.*, scrubbed
, result = c.query(&#39;
for $x in /root/r/text()
return concat(upper-case(substring($x,1,1)),lower-case(substring($x,2,1000)))
&#39;).value(&#39;text()[1]&#39;, &#39;VARCHAR(MAX)&#39;)
FROM @tbl AS t
CROSS APPLY (SELECT TRY_CAST(&#39;&lt;r&gt;&lt;![CDATA[&#39; + tokens + &#39; &#39; + &#39;]]&gt;&lt;/r&gt;&#39; AS XML).value(&#39;(/r/text())[1] cast as xs:token?&#39;,&#39;VARCHAR(MAX)&#39;)) AS t1(scrubbed)
CROSS APPLY (SELECT TRY_CAST(&#39;&lt;root&gt;&lt;r&gt;&lt;![CDATA[&#39; + 
REPLACE(scrubbed, @separator, &#39;]]&gt;&lt;/r&gt;&lt;r&gt;&lt;![CDATA[&#39;) + 
&#39;]]&gt;&lt;/r&gt;&lt;/root&gt;&#39; AS XML)) AS t2(c);

Output

tokens	scrubbed	result
mineSota Mining and MaNufacturing	mineSota Mining and MaNufacturing	Minesota Mining And Manufacturing
Minesota Mining & Manufacturing	Minesota Mining & Manufacturing	Minesota Mining & Manufacturing
tillamook	tillamook	Tillamook
MUTUAL OF OMAHA	MUTUAL OF OMAHA	Mutual Of Omaha
NULL

答案3

得分: 0

以下是已经翻译好的部分：

You've made the classic SQL Server XML mistake, one cannot just use the PATH('').
你犯了经典的SQL Server XML错误，不能只是使用PATH('')。

You have to do the convoluted PATH(''), TYPE).value('.', 'NVARCHAR(MAX)') thing to get proper encoded characters.
你必须执行复杂的PATH(''), TYPE).value('.', 'NVARCHAR(MAX)')操作才能获得正确编码的字符。

Here's your fixed version:
以下是已修复的版本：

with T as (
  select *
  from (
  values ('Minesota Mining and   Manufacturing')
  , ('Minesota Mining &amp; Manufacturing   ')
  , (' tillamook')
  , ('MUTUAL OF OMAHA')
  , ('   ')
  ) q(s)
),
scrubbed as (
  select T.s as InitialValue
  , CASE 
      WHEN LEN(RTRIM(T.s)) > 0
        THEN LTRIM(RTRIM(T.s))
    END as s
  from T
)
select distinct s.InitialValue
, stuff(
    (
      SELECT ' ' + t2.word
      from (
          select str.value
          , upper(substring(str.value, 1, 1)) + 
            case when len(str.value) > 1 then lower(substring(str.value, 2, len(str.value) - 1)) else '' end as word
          , charindex(' ' + str.value + ' ', ' ' + s.s + ' ') as idx
          from string_split(s.s, ' ') str
        ) t2
      order by t2.idx
      FOR XML PATH(''), TYPE
    ).value('.', 'NVARCHAR(MAX)'), 
    1, 
    1, 
    ''
  ) as INITCAP
from scrubbed s

请注意，我已经保留了SQL代码的原文，只翻译了需要翻译的部分。

英文:

You've made the classic SQL Server XML mistake, one cannot just use the PATH('').
You have to do the convuluted PATH(''), TYPE).value('.', 'NVARCHAR(MAX)') thing to get proper encoded characters.

Here's your fixed version:

with T as (
  select *
  from (
  values (&#39;Minesota Mining and   Manufacturing&#39;)
  , (&#39;Minesota Mining &amp; Manufacturing   &#39;)
  , (&#39; tillamook&#39;)
  , (&#39;MUTUAL OF OMAHA&#39;)
  , (&#39;   &#39;)
  ) q(s)
),
scrubbed as (
  select T.s as InitialValue
  , CASE 
      WHEN LEN(RTRIM(T.s)) &gt; 0
        THEN LTRIM(RTRIM(T.s))
    END as s
  from T
)
select distinct s.InitialValue
, stuff(
    (
      SELECT &#39; &#39; + t2.word
      from (
          select str.value
          , upper(substring(str.value, 1, 1)) + 
            case when len(str.value) &gt; 1 then lower(substring(str.value, 2, len(str.value) - 1)) else &#39;&#39; end as word
          , charindex(&#39; &#39; + str.value + &#39; &#39;, &#39; &#39; + s.s + &#39; &#39;) as idx
          from string_split(s.s, &#39; &#39;) str
        ) t2
      order by t2.idx
      FOR XML PATH(&#39;&#39;), TYPE
    ).value(&#39;.&#39;, &#39;NVARCHAR(MAX)&#39;), 
    1, 
    1, 
    &#39;&#39;
  ) as INITCAP
from scrubbed s

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

INITCAP在SQL Server查询中的用法

问题

答案1

答案2

答案3

“No overload for ‘Click_Edit’ matches delegate ‘EventHandler'”

重新分组数据以便求和

在go-gorm中，”mssql: Invalid column name ‘id'”的意思是”mssql: 无效的列名’id'”。

Spring搜索功能在未输入任何搜索内容时返回完整列表。

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

发表评论