在一个varchar列中,移除前导零和特殊字符后面的字符存在问题。

huangapple go评论79阅读模式
英文:

Having issues with removing preceding zeroes and characters after special characters from a varchar column

问题

I am trying to remove alphabets, space from left, right and in between, preceding zeros, and all the characters after special characters (comma, semicolon, backslash, forward slash).

我尝试去除字母,左右两边的空格,以及特殊字符(逗号、分号、反斜杠、正斜杠)后面的所有字符。

I have a column that contains value like the below :

我有一个包含以下值的列:

051 765 999
MK00564123
KJ786453425/9
432563542/3
096745632
53452BMG441,8

I am trying to remove alphabets and space from left and right from the value for those values containing alphabets in the value and also preceding 0.

我尝试从这些值中删除包含字母的值的左右两边的字母和空格,以及前导的0。

Expected Output:

预期输出:

51765999
564123
96745632

Also, I am trying to remove everything after a special character in the column (comma, semicolon, backslash, forward slash):

另外,我尝试删除列中特殊字符(逗号、分号、反斜杠、正斜杠)后面的所有内容:

Expected Output:

预期输出:

786453425
432563542
53452441

Final Output:

最终输出:

51765999
564123
96745632
786453425
432563542
53452441

I have created the fiddle but not getting the expected output as shown below:

我已经创建了示例,但是没有得到如下所示的预期输出:

在一个varchar列中,移除前导零和特殊字符后面的字符存在问题。

Fiddle: http://sqlfiddle.com/#!18/0b383/1123

Can someone please help me to identify what is the issue and how can I get the expected output?

请问有人可以帮助我识别问题所在以及如何获得预期的输出吗?

英文:

I am trying to remove alphabets, space from left, right and in between, preceding zeros, and all the characters after special characters(comma,semicolon, backslash, forward slash)

I have a column that contains value like the below :

051 765 999
MK00564123
KJ786453425/9
432563542/3
096745632
53452BMG441,8

I am trying to remove alphabets and space from left and right from the value for those values containing alphabets in the value and also preceding 0.

Expected Output:

51765999
564123
96745632

Also, I am trying to remove everything after a special character in the column(comma,semicolon, backslash, forward slash):

Expected Output:

786453425
432563542
53452441

Final Output:

51765999
564123
96745632
786453425
432563542
53452441

I have created the fiddle but not getting the expected output as shown below:

在一个varchar列中,移除前导零和特殊字符后面的字符存在问题。

Fiddle: http://sqlfiddle.com/#!18/0b383/1123

Can someone please help me to identify what is the issue and how can I get the expected output?

答案1

得分: 1

以下是翻译好的内容:

要从数字字符串中修剪前导零:

REPLACE(LTRIM(REPLACE('00123', '0', ' ')), ' ', '0') -- 返回 '123'

将你的返回更改为:

RETURN REPLACE(LTRIM(REPLACE(ISNULL(@strAlphaNumeric,0), '0', ' ')), ' ', '0')

查看在线演示

这通过将所有的 0 更改为空格,左修剪,然后将所有的空格改回 0 来实现。


当然,如果SQL Server支持 REGEXP_REPLACE(),那将会非常简单:

REGEXP_REPLACE('00123', '^0+', '') -- 将返回 '123'
英文:

To trim leading zeroes from a numeric string:

REPLACE(LTRIM(REPLACE('00123', '0', ' ')), ' ', '0') -- returns '123'

Change your return to:

RETURN REPLACE(LTRIM(REPLACE(ISNULL(@strAlphaNumeric,0), '0', ' ')), ' ', '0')

See live demo.

This works by changing all 0 to a space, left trimming, then changing all space back to 0.


Of course if SQL Server ever gets around to supporting REGEXP_REPLACE() it would be trivial:

REGEXP_REPLACE('00123', '^0+', '') -- would return '123'

答案2

得分: 1

这是一种方法:

CREATE FUNCTION dbo.udf_GetCleanedAlphaNumeric (@alphanumeric VARCHAR(100))
RETURNS TABLE
AS
    RETURN
      (SELECT SUBSTRING(cleaned, PATINDEX('%[^0]%', cleaned + 'x'), 8000) AS Final
       FROM   (VALUES (SUBSTRING(@alphanumeric, 1, PATINDEX('%[,;\/]%', @alphanumeric + ';') - 1))) ca1(prefix)
              CROSS APPLY (VALUES ( REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
                                    REPLACE(REPLACE(prefix, ' ', ''), 'a', '')
                                    , 'b', ''), 'c', ''), 'd', ''), 'e', ''), 'f', '')
                                    , 'g', ''), 'h', ''), 'i', ''), 'j', ''), 'k', '')
                                    , 'l', ''), 'm', ''), 'n', ''), 'o', ''), 'p', '')
                                    , 'q', ''), 'r', ''), 's', ''), 't', ''), 'u', '')
                                    , 'v', ''), 'w', ''), 'x', ''), 'y', ''), 'z', '') )) ca2(cleaned)) 

然后:

SELECT alphanumeric,
       Final
FROM   temp
CROSS APPLY dbo.udf_GetCleanedAlphaNumeric(alphanumeric)
  • 首先提取出在任何,;\/之前的子字符串(如果存在)
  • 然后去除所有字母或空格(在后续版本中可以使用TRANSLATE函数更简洁地实现,如编辑历史中所示)
  • 然后移除前导零

DB Fiddle

英文:

One way of doing this is

CREATE FUNCTION dbo.udf_GetCleanedAlphaNumeric (@alphanumeric VARCHAR(100))
RETURNS TABLE
AS
    RETURN
      (SELECT SUBSTRING(cleaned, PATINDEX('%[^0]%', cleaned + 'x'), 8000) AS Final
       FROM   (VALUES (SUBSTRING(@alphanumeric, 1, PATINDEX('%[,;\/]%', @alphanumeric + ';') - 1))) ca1(prefix)
              CROSS APPLY (VALUES ( REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
			                        REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
									REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
									REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
									REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
									REPLACE(REPLACE(prefix, ' ', ''), 'a', '')
									, 'b', ''), 'c', ''), 'd', ''), 'e', ''), 'f', '')
									, 'g', ''), 'h', ''), 'i', ''), 'j', ''), 'k', '')
									, 'l', ''), 'm', ''), 'n', ''), 'o', ''), 'p', '')
									, 'q', ''), 'r', ''), 's', ''), 't', ''), 'u', '')
									, 'v', ''), 'w', ''), 'x', ''), 'y', ''), 'z', '') )) ca2(cleaned)) 

and then

SELECT alphanumeric,
       Final
FROM   temp
CROSS APPLY dbo.udf_GetCleanedAlphaNumeric(alphanumeric)
  • First extract the substring before any ,;\/ (if present)
  • Then strip out any letters or spaces (can be made much more concise on later versions with TRANSLATE function as in edit history)
  • Then remove leading zeroes

(DB Fiddle)

答案3

得分: 1

此函数可以轻松内联。

CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
declare @i bigint
set @strAlphaNumeric = replace(TRANSLATE(@strAlphaNumeric, 'abcdefghijklmnopqrstuvwxyz /\,;', replicate('a', 27)+replicate(';',4)),'a','')
set @i=charindex(';',@strAlphaNumeric)
if @i>0
   set @strAlphaNumeric = left(@strAlphaNumeric,@i-1)
RETURN replace(LTRIM(replace(ISNULL(@strAlphaNumeric,0), '0', ' ')), ' ', '0')
END
英文:

This function can be easily inlined.

CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
declare @i bigint
set @strAlphaNumeric = replace(TRANSLATE(@strAlphaNumeric, 'abcdefghijklmnopqrstuvwxyz /\,;', replicate('a', 27)+replicate(';',4)),'a','')
set @i=charindex(';',@strAlphaNumeric)
if @i>0
   set @strAlphaNumeric = left(@strAlphaNumeric,@i-1)
RETURN replace(LTRIM(replace(ISNULL(@strAlphaNumeric,0), '0', ' ')), ' ', '0')
END

答案4

得分: 1

One method would be to use PATINDEX to find the position of the first non-alphanumeric or space character, and get the LEFT most characters up to that point.

Then I use a helper function I wrote which replaces characters that fit a certain pattern with another character. As you are on SQL Server 2016, then you'll need to use the XML version of it, as STRING_AGG was added in 2017. (note that the helper function also uses a tally helper function, as GENERATE_SERIES was added in 2022).

Finally I convert the value to a bigint to trim leading zeroes.

CREATE FUNCTION [fn].[Tally] (@LastNumber bigint, @Zero bit) 
RETURNS table
AS RETURN
    --Copyright Thom A (https://larnu.uk) under CC-by-SA
    WITH N AS(
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
    Tally AS(
        SELECT 0 AS I
        WHERE @Zero = 0
          AND @LastNumber IS NOT NULL
        UNION ALL
        SELECT TOP (ISNULL(@LastNumber,0))
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
        FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7) --Up to 10,000,000 rows
    SELECT I
    FROM Tally T;
GO

CREATE FUNCTION [fn].[PatternCharacterReplace_XML] (@String varchar(8000), @Pattern varchar(100), @ReplacementCharacter varchar(1)) 
RETURNS table
AS RETURN
    --Copyright Thom A (https://larnu.uk) under CC-by-SA
    SELECT (SELECT CASE WHEN V.C LIKE @Pattern THEN @ReplacementCharacter ELSE V.C END
            FROM fn.Tally(CONVERT(int,LEN(@String)),1) T
                 CROSS APPLY (VALUES(SUBSTRING(@String,T.I,1)))V(C)
            ORDER BY T.I
            FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(8000)') AS ReplacedString;
GO

SELECT V.YourString,
       PI.I,
       LEFT(V.YourString,PI.I),
       TRY_CONVERT(bigint,PCRX.ReplacedString)
FROM (VALUES('051 765 999'),
            ('MK00564123'),
            ('KJ786453425/9'),
            ('432563542/3'),
            ('096745632'),
            ('53452BMG441,8'))V(YourString)
     CROSS APPLY(VALUES(ISNULL(NULLIF(PATINDEX('%[^0-9 A-Z]%',V.YourString),0)-1,LEN(V.YourString))))PI(I)
     CROSS APPLY fn.PatternCharacterReplace_XML(LEFT(V.YourString,PI(I)), '[^0-9]', '') PCRX;

This'll certainly work on versions as old as SQL Server 2012 for strings where the numbers to return are 18 numbers or less.

If the numerical value would be greater than 9,223,372,036,854,775,807 then you could use PATINDEX and STUFF to remove the leading 0 characters.

英文:

One method would be to use PATINDEX to find the position of the first non-alphanumeric or space character, and get the LEFT most characters up to that point.

Then I use a helper function I wrote which replaces characters that fit a certain pattern with another character. As you are on SQL Server 2016, then you'll need to use the XML version of it, as STRING_AGG was added in 2017. (note that the helper function also uses a tally helper function, as GENERATE_SERIES was added in 2022).

Finally I convert the value to a bigint to trim leading zeroes.

CREATE FUNCTION [fn].[Tally] (@LastNumber bigint, @Zero bit) 
RETURNS table
AS RETURN
    --Copyright Thom A (https://larnu.uk) under CC-by-SA
    WITH N AS(
        SELECT N
        FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL),(NULL))N(N)),
    Tally AS(
        SELECT 0 AS I
        WHERE @Zero = 0
          AND @LastNumber IS NOT NULL
        UNION ALL
        SELECT TOP (ISNULL(@LastNumber,0))
               ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
        FROM N N1, N N2, N N3, N N4, N N5, N N6, N N7) --Up to 10,000,000 rows
    SELECT I
    FROM Tally T;
GO

CREATE FUNCTION [fn].[PatternCharacterReplace_XML] (@String varchar(8000), @Pattern varchar(100), @ReplacementCharacter varchar(1)) 
RETURNS table
AS RETURN
    --Copyright Thom A (https://larnu.uk) under CC-by-SA
    SELECT (SELECT CASE WHEN V.C LIKE @Pattern THEN @ReplacementCharacter ELSE V.C END
            FROM fn.Tally(CONVERT(int,LEN(@String)),1) T
                 CROSS APPLY (VALUES(SUBSTRING(@String,T.I,1)))V(C)
            ORDER BY T.I
            FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(8000)') AS ReplacedString;
GO
SELECT V.YourString,
       PI.I,
       LEFT(V.YourString,PI.I),
       TRY_CONVERT(bigint,PCRX.ReplacedString)
FROM (VALUES('051 765 999'),
            ('MK00564123'),
            ('KJ786453425/9'),
            ('432563542/3'),
            ('096745632'),
            ('53452BMG441,8'))V(YourString)
     CROSS APPLY(VALUES(ISNULL(NULLIF(PATINDEX('%[^0-9 A-Z]%',V.YourString),0)-1,LEN(V.YourString))))PI(I)
     CROSS APPLY fn.PatternCharacterReplace_XML(LEFT(V.YourString,PI.I),'[^0-9]','') PCRX;

This'll certainly work on versions as old as SQL Server 2012 for strings where the numbers to return are 18 numbers or less.

If the numerical value would be greater than 9,223,372,036,854,775,807 then you could use PATINDEX and STUFF to remove the leading 0 characters.

答案5

得分: 1

以下是您提供的内容的翻译:

MS SQL Server 2017 Schema Setup:

CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
DECLARE @intSpecial INT
SET @intSpecial = PATINDEX('%[.,+*?^$(){}\\/]%', @strAlphanumeric)
IF @intSpecial > 0
BEGIN
   SET @strAlphaNumeric = SUBSTRING(@strAlphaNumeric,1,@intSpecial)
END

SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(REPLACE(LTRIM(REPLACE(
@strAlphaNumeric
  , '0', ' ')),' ', '0'),0)
END
;

Query 1:

create table temp
(
	alphanumeric varchar(100)
)

Query 2:

insert into temp
(alphanumeric)
values
('051 765 999'),
('MK00564123'),
('KJ786453425/9'),
('432563542/3'),
('096745632'),
('53452BMG441,8')

Query 3:

SELECT alphanumeric,dbo.udf_GetNumeric(alphanumeric) 
from temp

Results:

|  alphanumeric |           |
|---------------|-----------|
|   051 765 999 |  51765999 |
|    MK00564123 |    564123 |
| KJ786453425/9 | 786453425 |
|   432563542/3 | 432563542 |
|     096745632 |  96745632 |
| 53452BMG441,8 |  53452441 |

希望这有助于您理解代码和查询的内容。

英文:

SQL Fiddle

MS SQL Server 2017 Schema Setup:

CREATE FUNCTION dbo.udf_GetNumeric
(@strAlphaNumeric VARCHAR(256))
RETURNS VARCHAR(256)
AS
BEGIN
DECLARE @intAlpha INT
DECLARE @intSpecial INT
SET @intSpecial = PATINDEX('%[.,+*?^$(){}\\/]%', @strAlphanumeric)
IF @intSpecial > 0
BEGIN
SET @strAlphaNumeric = SUBSTRING(@strAlphaNumeric,1,@intSpecial)
END
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
BEGIN
WHILE @intAlpha > 0
BEGIN
SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
END
END
RETURN ISNULL(REPLACE(LTRIM(REPLACE(
@strAlphaNumeric
, '0', ' ')),' ', '0'),0)
END
;

Query 1:

create table temp
(
alphanumeric varchar(100)
)

Results:

Query 2:

insert into temp
(alphanumeric)
values
('051 765 999'),
('MK00564123'),
('KJ786453425/9'),
('432563542/3'),
('096745632'),
('53452BMG441,8')

Results:

Query 3:

SELECT alphanumeric,dbo.udf_GetNumeric(alphanumeric) 
from temp

Results:

|  alphanumeric |           |
|---------------|-----------|
|   051 765 999 |  51765999 |
|    MK00564123 |    564123 |
| KJ786453425/9 | 786453425 |
|   432563542/3 | 432563542 |
|     096745632 |  96745632 |
| 53452BMG441,8 |  53452441 |

答案6

得分: 1

以下是您要翻译的代码部分:

在这里,您可以使用这个使用`PATINDEX`的函数。

CREATE FUNCTION dbo.RemoveAlphabets
(
    @input VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
BEGIN

    DECLARE
    @match_pattern VARCHAR(MAX) = '%[a-z ]%',
    @stop_pattern VARCHAR(MAX) = '%[.,/]%',
    @replace_value VARCHAR(MAX) = '',
    @match_ix INT,
    @input_copy VARCHAR(MAX)

    SET @input_copy = @input

    SET @match_ix = PATINDEX(@stop_pattern, @input_copy);
    IF @match_ix > 0
    BEGIN
        SET @input_copy = SUBSTRING(@input_copy, 1, @match_ix - 1)
    END

    SET @match_ix = PATINDEX(@match_pattern, @input_copy);

    WHILE @match_ix > 0
    BEGIN
        SET @input_copy = REPLACE(@input_copy, SUBSTRING(@input_copy, @match_ix, 1), @replace_value);
        SET @match_ix = PATINDEX(@match_pattern, @input_copy);
    END

    RETURN @input_copy;

END

并且这是一个测试。

SELECT dbo.RemoveAlphabets(col1)
FROM (
    VALUES (N'051 765 999')
    , (N'MK00564123')
    , (N'KJ786453425/9')
    , (N'432563542/3')
    , (N'096745632')
    , (N'53452BMG441,8')
) t (col1)
英文:

Here You can use this function which uses PATINDEX.

CREATE FUNCTION dbo.RemoveAlphabets
(
@input          VARCHAR(MAX)
)
RETURNS VARCHAR(MAX)
BEGIN
declare
@match_pattern  VARCHAR(MAX) = '%[a-z ]%',
@stop_pattern  VARCHAR(MAX) = '%[.,/]%',
@replace_value  VARCHAR(MAX) = '',
@match_ix   INT,
@input_copy          VARCHAR(MAX)
SET @input_copy = @input
SET @match_ix = PATINDEX(@stop_pattern, @input_copy);
IF @match_ix > 0
BEGIN
SET @input_copy = SUBSTRING(@input_copy, 1, @match_ix - 1)
END
SET @match_ix = PATINDEX(@match_pattern, @input_copy);
WHILE @match_ix > 0
BEGIN
SET @input_copy = REPLACE(@input_copy, SUBSTRING(@input_copy, @match_ix, 1), @replace_value);
SET @match_ix = PATINDEX(@match_pattern, @input_copy);
END
RETURN @input_copy;
END

And this is a test.

select dbo.RemoveAlphabets(col1)
from (
VALUES  (N'051 765 999')
,   (N'MK00564123')
,   (N'KJ786453425/9')
,   (N'432563542/3')
,   (N'096745632')
,   (N'53452BMG441,8')
) t (col1)

huangapple
  • 本文由 发表于 2023年6月1日 01:06:43
  • 转载请务必保留本文链接:https://go.coder-hub.com/76375854.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定