May 10, 2023, 15:46:54 · go

I need to create a regular expression where 2 groups point to the same pattern

Question

I have 4 file names:

    ABC_ALL_20230508T050011.zip
    ABC_Intra1_20230508T050011.zip
    ABC_Intra2_20230508T050011.zip
    ABC_INT_20230508T050011.zip

I am trying to create a regex that captures ExtractId, FileName and Date. ExtractId and FileName both need to capture the same value from the middle position, e.g. 'ALL', 'Intra1', 'Intra2' or 'INT'.

I currently have:

    (?<ExtractId>[a-zA-Z]{3})_(?<FileName>[a-zA-Z0-9]*)_(?<FileDate>[0-9A-Z]{8}).*

and it results in:

    ExtractId = ABC
    Matched
    FileName = ALL
    Matched
    FileDate = 20230508

I am after:

    ExtractId = ALL
    Matched
    FileName = ALL
    Matched
    FileDate = 20230508

I believe there is a way of achieving this using a regex subexpression, where 2 groups can point to the same position, but I have never used it before.

Thanks

Answer 1

Score: 1

You can use a named capture group in a lookahead assertion:

    (?=[a-zA-Z]{3}_(?<ExtractId>[a-zA-Z0-9]+)_)[a-zA-Z]{3}_(?<FileName>[a-zA-Z0-9]+)_(?<FileDate>[0-9A-Z]{8}).*

Explanation

(?= Positive lookahead assertion
- [a-zA-Z]{3}_ Match 3 chars a-zA-Z and then match _
- (?<ExtractId>[a-zA-Z0-9]+)_ Named group ExtractId, capture 1+ chars a-zA-Z0-9 and then match _
) Close the lookahead
[a-zA-Z]{3}_ Match 3 chars a-zA-Z and then match _
(?<FileName>[a-zA-Z0-9]+) Named group FileName, capture 1+ chars a-zA-Z0-9
_ Match literally
(?<FileDate>[0-9A-Z]{8}) Named group FileDate, capture 8 chars 0-9A-Z
.* Match the rest of the string (if you need that, else you can omit this part)

Because the lookahead does not consume any characters, the main pattern matches the same text again, so ExtractId and FileName capture the same middle value.

See a regex demo.

If you want to anchor the match to the start of the string, you can prepend ^:

    ^(?=[a-zA-Z]{3}_(?<ExtractId>[a-zA-Z0-9]+)_)[a-zA-Z]{3}_(?<FileName>[a-zA-Z0-9]+)_(?<FileDate>[0-9A-Z]{8}).*
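
As a rough sketch of how the anchored pattern behaves, the snippet below runs it against the four file names from the question. Python is an assumption here (the question does not say which engine is used), so the named groups are written as `(?P<name>...)` rather than `(?<name>...)`, and the helper name `parse_name` is hypothetical. Whatever engine is used must support lookahead assertions for this approach to work.

```python
import re

# Anchored pattern from the answer, rewritten with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"^(?=[a-zA-Z]{3}_(?P<ExtractId>[a-zA-Z0-9]+)_)"  # lookahead: capture the middle part, consume nothing
    r"[a-zA-Z]{3}_(?P<FileName>[a-zA-Z0-9]+)_"        # consume the prefix, capture the middle part again
    r"(?P<FileDate>[0-9A-Z]{8}).*"                    # first 8 characters after the second underscore
)


def parse_name(name):
    """Return the named groups as a dict, or None if the name does not match."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None


if __name__ == "__main__":
    names = [
        "ABC_ALL_20230508T050011.zip",
        "ABC_Intra1_20230508T050011.zip",
        "ABC_Intra2_20230508T050011.zip",
        "ABC_INT_20230508T050011.zip",
    ]
    for name in names:
        print(name, parse_name(name))
```

For each of the four names, ExtractId and FileName should hold the same middle segment, and FileDate should be 20230508.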

Answer 2

Score: 0

You can put the first of the overlapping capture groups in a lookahead assertion, so that it consumes no input and leaves the same text for the second overlapping capture group to match:

    _(?=(?<ExtractId>[^_]+))(?<FileName>[^_]+)_(?<FileDate>[0-9A-Z]{8})

Demo: https://regex101.com/r/YhFyIn/2
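
A minimal sketch of this shorter, unanchored variant, again assuming Python (so `(?P<name>...)` replaces `(?<name>...)`) and using a hypothetical helper name `extract_fields`. Since the pattern starts at the first underscore rather than at the start of the string, the sketch uses `search` instead of `match`.

```python
import re

# Unanchored pattern from the answer, rewritten with Python's (?P<name>...) syntax.
PATTERN = re.compile(
    r"_(?=(?P<ExtractId>[^_]+))"   # after an underscore, peek at the next segment without consuming it
    r"(?P<FileName>[^_]+)_"        # consume that same segment, then the following underscore
    r"(?P<FileDate>[0-9A-Z]{8})"   # first 8 characters of the timestamp
)


def extract_fields(name):
    """Return the named groups as a dict, or None if nothing in the name matches."""
    m = PATTERN.search(name)
    return m.groupdict() if m else None


if __name__ == "__main__":
    print(extract_fields("ABC_Intra2_20230508T050011.zip"))
```

Both groups use the greedy class `[^_]+`, so each one stops at the next underscore and they capture the same segment. This variant does not check the three-letter prefix, which may or may not matter for the real file names.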
