June 15, 2023 03:50:39 · go

SQL regexp that accepts Chinese characters and ASCII and rejects special characters

Question

I need a SQL regexp pattern that satisfies the following criteria.

1. Accepts Chinese characters,
2. Accepts: a-z, A-Z, 0-9, spaces,
3. Rejects only special characters.

I've tried the following.

Select regexp_like((val)::TEXT , ('^[ !-’~¡-ÿ]*$')::TEXT)

Or

regexp_like((val)::TEXT ,('^[^[:ascii:]]+$')::text);

Both of the queries above also accept special characters that they should reject.

SELECT (('#$$')::TEXT ~ ('^[a-zA-Z0-9]*$'));

This query rejects special characters but fails to accept Chinese characters.

Answer 1

Score: 0

You can use the Unicode values of the Chinese characters:

SELECT (('#$$')::TEXT ~ ('^[a-zA-Z0-9\x4e00-\x9fff\x3400-\x4dbf]*$'));
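The same whitelist idea can be checked outside the database. The sketch below is Python, not PostgreSQL, so the code points are written as `\uXXXX` (Python's `re` reads `\x` as exactly two hex digits). The name `is_allowed` is made up for illustration, and a space is added to the class on the assumption that spaces should pass, as the question asks.

```python
import re

# Whitelist: ASCII letters, digits, space, CJK Unified Ideographs
# (U+4E00-U+9FFF) and Extension A (U+3400-U+4DBF).
ALLOWED = re.compile(r"[a-zA-Z0-9 \u4e00-\u9fff\u3400-\u4dbf]*")


def is_allowed(value: str) -> bool:
    """Return True when every character of value is in the whitelist."""
    # fullmatch anchors at both ends, so no ^ or $ is needed.
    return ALLOWED.fullmatch(value) is not None


if __name__ == "__main__":
    for sample in ["#$$", "abc 123", "中文 abc"]:
        print(repr(sample), is_allowed(sample))
```

Regex dialects differ between engines, so treat this as a check of the character ranges rather than of the SQL statement itself.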

Answer 2

Score: 0

According to Wikipedia, Chinese characters fall within the Unicode range U+4E00 through U+9FFF (Wikipedia: CJK Unified Ideographs).

Additionally, there are Extensions A through H.
CJK Unified Ideographs Extension A, U+3400 through U+4DBF.
CJK Unified Ideographs Extension B, U+20000 through U+2A6DF.
CJK Unified Ideographs Extension C, U+2A700 through U+2B73F.
CJK Unified Ideographs Extension D, U+2B740 through U+2B81F.
CJK Unified Ideographs Extension E, U+2B820 through U+2CEAF.
CJK Unified Ideographs Extension F, U+2CEB0 through U+2EBEF.
CJK Unified Ideographs Extension G, U+30000 through U+3134F.
CJK Unified Ideographs Extension H, U+31350 through U+323AF.
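If the extension blocks matter for your data, the ranges listed above can be turned into a character class mechanically. This is a rough Python sketch with hypothetical names (`CJK_RANGES`, `cjk_class_body`, `is_cjk`). Whether a given database engine accepts code points above U+FFFF in a bracket expression depends on the engine and its encoding, which is not tested here.

```python
import re

# (first, last) code points copied from the block list above.
CJK_RANGES = [
    (0x4E00, 0x9FFF),    # CJK Unified Ideographs
    (0x3400, 0x4DBF),    # Extension A
    (0x20000, 0x2A6DF),  # Extension B
    (0x2A700, 0x2B73F),  # Extension C
    (0x2B740, 0x2B81F),  # Extension D
    (0x2B820, 0x2CEAF),  # Extension E
    (0x2CEB0, 0x2EBEF),  # Extension F
    (0x30000, 0x3134F),  # Extension G
    (0x31350, 0x323AF),  # Extension H
]


def cjk_class_body() -> str:
    """Build the inside of a character class from CJK_RANGES."""
    # \UXXXXXXXX is Python's eight-digit escape; it covers every plane.
    return "".join(f"\\U{lo:08X}-\\U{hi:08X}" for lo, hi in CJK_RANGES)


CJK_ONLY = re.compile(f"[{cjk_class_body()}]+")


def is_cjk(text: str) -> bool:
    """Return True when text is non-empty and uses only the listed blocks."""
    return CJK_ONLY.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_cjk("中文"), is_cjk(chr(0x20000)), is_cjk("abc"))
```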

So, you can add a Unicode range to your character class, as follows.

> "1.Accepts chinese characters"

[\u4e00-\u9fff]

> "2.Accepts - a-z,A-Z,0-9,spaces"

The (?i) flag turns on case-insensitive mode.

(?i)[a-z\d \u4e00-\u9fff]

> "3.Rejects only special characters."

I imagine the values you provided are the characters you wish to reject.
Within the provided range, ! through ’, you want to skip over the digits, 0 through 9, and the uppercase letters, A through Z.

So, that will need to be changed to the following.

[^!-/:-@\[-`~¡-ÿ]

You can then combine this character class with the previous one using the character class intersection syntax, &&.

So, the complete pattern would be the following.

(?i)^[a-z\d \u4e00-\u9fff&&[^!-/:-@\[-`~¡-ÿ]]*$
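One caveat: the `&&` intersection syntax comes from Java's `java.util.regex`, and not every engine has it. A whitelist class already rejects everything it does not list, so the subtraction step may not be needed at all. The Python sketch below (hypothetical names, basic CJK block only) checks that every character in the reject set above already fails a plain whitelist.

```python
import re
from typing import List

# Plain whitelist: no intersection and no case-insensitive flag.
WHITELIST = re.compile(r"[A-Za-z0-9 \u4e00-\u9fff]*")

# Characters the negated class above is meant to exclude:
# ! through /, : through @, [ through `, ~, and U+00A1 through U+00FF.
REJECT_RANGES = [(0x21, 0x2F), (0x3A, 0x40), (0x5B, 0x60), (0x7E, 0x7E), (0xA1, 0xFF)]


def rejected_characters() -> List[str]:
    """Expand REJECT_RANGES into individual characters."""
    return [chr(cp) for lo, hi in REJECT_RANGES for cp in range(lo, hi + 1)]


def leaks() -> List[str]:
    """Characters from the reject set that the whitelist would still accept."""
    return [ch for ch in rejected_characters() if WHITELIST.fullmatch(ch)]


if __name__ == "__main__":
    print("leaked characters:", leaks())
```

Spelling out `A-Za-z0-9` instead of using `(?i)` and `\d` is a deliberate choice in this sketch: in Python's default Unicode mode `\d` also matches non-ASCII digits. The whitelist also rejects `{`, `|` and `}`, which the reject set above does not list.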
