2021-11-02 07:04:53 · go

Switching multi-language substring positions using regex

Question

Raw input with Lithuanian letters:

Ą.BČ
Ą.BČ D Ę
Ą. BČ
Ą. BČ D Ę
Ą BČ
Ą BČ D Ę
The line below should not be affected:
ĄB ČD DĘ

Expected result:

BČ Ą.
BČ Ą. D Ę
BČ Ą. 
BČ Ą. D Ę
BČ Ą 
BČ Ą D Ę
ĄB ČD DĘ

What I've tried:

^(.\.? *)([\p{L}\p{N}\p{M}]*)$
With a ReplaceAllString substitution like so:
$2 $1

I have tried various patterns, but this is the best I could come up with for now.
It captures the 1st, 3rd and 5th lines and substitutes them successfully
(apart from some extra spaces at the end of lines):

BČ Ą.
Ą.BČ D Ę
BČ Ą. 
Ą. BČ D Ę
BČ Ą 
Ą BČ D Ę
ĄB ČD DĘ
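
To see why only three lines change, here is a minimal Go sketch that runs the pattern above on each sample line. The names `firstAttempt` and `applyFirstAttempt` are made up for illustration, and processing the input one line at a time is an assumption (without the `(?m)` flag, `^` and `$` in Go only match at the start and end of the whole text). Because of the trailing `$`, the second group has to reach the end of the line, so any line that continues with ` D Ę` does not match and is returned unchanged.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstAttempt is the pattern from the question, unchanged.
var firstAttempt = regexp.MustCompile(`^(.\.? *)([\p{L}\p{N}\p{M}]*)$`)

// applyFirstAttempt (hypothetical helper) runs the question's
// substitution on a single line.
func applyFirstAttempt(line string) string {
	return firstAttempt.ReplaceAllString(line, "${2} ${1}")
}

func main() {
	lines := []string{
		"Ą.BČ", "Ą.BČ D Ę",
		"Ą. BČ", "Ą. BČ D Ę",
		"Ą BČ", "Ą BČ D Ę",
		"ĄB ČD DĘ",
	}
	for _, line := range lines {
		// %q puts quotes around each value so trailing spaces are visible.
		fmt.Printf("%q -> %q\n", line, applyFirstAttempt(line))
	}
}
```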

Explanation:

> There is a set of data with varying entries of the underlying basic
> structure [FIRST NAME FIRST LETTER][LASTNAME] which I want to ideally
> bring to [LASTNAME][SPACE][FIRST NAME FIRST LETTER][DOT]?

Link to regex101:
regex101

Final solution:

^([\p{L}\p{N}\p{M}](?:\. *| +))([\p{L}\p{N}\p{M}]+)
With a ReplaceAllString substitution like so:
$2 $1
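
A minimal sketch of how the final pattern might be used from Go follows. The helper name `swapInitial` and the whitespace cleanup with `strings.Fields` are assumptions added here, not part of the original solution. The cleanup is included because the first capture group keeps any spaces that follow the initial, so the raw replacement can leave doubled or trailing spaces.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// swapRe is the final pattern from the question: one letter, digit or mark,
// then either a dot with optional spaces or at least one space, then the
// last name.
var swapRe = regexp.MustCompile(`^([\p{L}\p{N}\p{M}](?:\. *| +))([\p{L}\p{N}\p{M}]+)`)

// swapInitial (hypothetical helper) moves the leading initial behind the
// last name on a single line.
func swapInitial(line string) string {
	swapped := swapRe.ReplaceAllString(line, "${2} ${1}")
	// Assumption: collapse runs of spaces and trim the ends, since group 1
	// carries the spaces that followed the initial.
	return strings.Join(strings.Fields(swapped), " ")
}

func main() {
	input := "Ą.BČ\nĄ.BČ D Ę\nĄ. BČ\nĄ. BČ D Ę\nĄ BČ\nĄ BČ D Ę\nĄB ČD DĘ"
	// Handle each line separately; without (?m), ^ only matches at the
	// start of the whole text.
	for _, line := range strings.Split(input, "\n") {
		fmt.Printf("%q -> %q\n", line, swapInitial(line))
	}
}
```

Writing the references as `${2} ${1}` is equivalent to `$2 $1` here; the braces only make the group boundaries explicit in Go's replacement template.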

Answer 1

Score: 1

For your example data, you can omit the anchor $ and match either a dot followed by optional spaces, or 1 or more spaces.

To prevent an empty match for the character class, you can repeat it 1 or more times using + instead of *:

^(.(?:\. *| +))([\p{L}\p{N}\p{M}]+)

See a regex demo

Note that the . can match any character, including a space. You might also change the dot to a single [\p{L}\p{N}\p{M}].
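
The difference between the two variants can be checked with a small Go sketch. The variable names `loosePattern` and `strictPattern` and the extra sample strings are invented for illustration. With a leading `.`, a line that starts with a space or a punctuation character still matches; with the character class it does not.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// loosePattern starts with ".", which accepts any character.
	loosePattern = regexp.MustCompile(`^(.(?:\. *| +))([\p{L}\p{N}\p{M}]+)`)
	// strictPattern requires the first character to be a letter, digit or mark.
	strictPattern = regexp.MustCompile(`^([\p{L}\p{N}\p{M}](?:\. *| +))([\p{L}\p{N}\p{M}]+)`)
)

func main() {
	// "Ą. BČ" is from the question; the other two are made-up edge cases.
	samples := []string{"Ą. BČ", "  BČ", "-.BČ"}
	for _, s := range samples {
		fmt.Printf("%q loose=%t strict=%t\n",
			s, loosePattern.MatchString(s), strictPattern.MatchString(s))
	}
}
```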
