May 17, 2023 17:21:11 · go

Regex not in group

Question

A positive lookahead at the end of the pattern is one idea for handling the ?' in the text, but it does not work here:

FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)(?=\?'|$)

(?=\?'|$) is a positive lookahead, not a negative one. Combined with the lazy .*?, it ends the match right before the first ?' or, failing that, only at the end of the string ($). So it stops at the escaped ?' rather than skipping over it, and it never treats a plain ' as the segment terminator.

On the provided EDI text, this pattern therefore does not capture the full FTX text when it contains ?'. The escape has to be handled with a lookbehind or a negated character class instead.

I am trying to extract some text from an EDI file with a regex. I have this text:

UNA:+.? '
UNB+UNOC:3+5790000120420:14+5790000181872:14+991111:1850+KuvertNr1234'
UNH+BrevNr5678+CONTRL:D:93A:ZZ:C0230Q+CTL02'
UCI+MEDREF01095+5790000181872:14+5790000120420:14+4'
UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'
FTX+VER+P00++EDI-brev med nummeret 1111MAN01095, afsendt 11/11 1999 kl 18.46 har \:ikke kunnet modtages. Horsens Sygehus, laboratoriet kan ikke modtage \:sygehushenvisninger. :Med venlig hilsen: IT-Hotline. Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'UNZ+1+KuvertNr1234'

I need the FTX text, and I have this regex for it: FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'

But in EDIFACT, ? escapes ', so if I add ?' to the text:

UNA:+.? '
UNB+UNOC:3+5790000120420:14+5790000181872:14+991111:1850+KuvertNr1234'
UNH+BrevNr5678+CONTRL:D:93A:ZZ:C0230Q+CTL02'
UCI+MEDREF01095+5790000181872:14+5790000120420:14+4'
UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'
FTX+VER+P00++EDI-brev med nummeret 1111MAN01095, afsendt 11/11 1999 kl 18.46 har \:ikke kunnet modtages. Horsens Sygehus, laboratoriet kan ikke modtage \:sygehushenvisninger. :Med venlig hilsen: IT-Hotline.?' Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'UNZ+1+KuvertNr1234'

My regex stops at the ' char right after the ?. How can I keep using .*? but ignore the ?' in the text?
The EDIFACT can arrive either with \n line breaks or as one long string without them.

Tried with: FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'
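
To make the failure concrete, here is a minimal sketch in Python. Python is an assumption (the question does not name the engine), SAMPLE is a shortened version of the EDI text above, and the helper name ftx_text_original is made up for illustration.

```python
import re

# The pattern from the question: lazy .*? followed by a bare ' terminator.
ORIGINAL = re.compile(r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'")

# Shortened sample (assumption): same structure as the question's text,
# with an escaped ?' inside the FTX free text.
SAMPLE = (
    "UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'"
    "FTX+VER+P00++Med venlig hilsen: IT-Hotline.?' Horsens Sygehus."
    "Telefon 12345678.'UNT+5+BrevNr5678'"
)


def ftx_text_original(edi):
    """Return group 3 of the original pattern, or None if nothing matches."""
    match = ORIGINAL.search(edi)
    return match.group(3) if match else None


# The lazy group stops at the first ', which is the escaped one.
print(ftx_text_original(SAMPLE))  # Med venlig hilsen: IT-Hotline.?
```

The captured text ends at the ? because the first ' the engine meets is the one that ? was meant to escape.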

Answer 1

Score: 1


You can use

FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'(?<!\?')

See the regex demo.

Details:

FTX\+ - FTX+ string
([a-zA-Z]{2,4}) - Group 1: two to four ASCII letters
\+ - a + char
([a-zA-Z0-9]{3}) - Group 2: three ASCII alphanumeric chars
\+\+ - a ++ string
(.*?) - Group 3: any zero or more chars other than a newline char, as few as possible
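'(?<!\?') - a ' char that is not immediately preceded by a ? char (the negative lookbehind runs after the ' is matched and rejects the two-char sequence ?').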
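
The pattern above can be tried out with a short Python sketch. Python's re module is an assumption here; any engine with fixed-width lookbehind support should behave the same way. SAMPLE is a shortened version of the question's text, and ftx_text_lookbehind is a hypothetical helper name.

```python
import re

# Answer 1's pattern: the ' only ends the match when the two characters
# just consumed are not ?' (negative lookbehind placed after the quote).
ESCAPE_AWARE = re.compile(
    r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'(?<!\?')"
)

# Shortened sample (assumption) with an escaped ?' inside the FTX text.
SAMPLE = (
    "UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'"
    "FTX+VER+P00++Med venlig hilsen: IT-Hotline.?' Horsens Sygehus."
    "Telefon 12345678.'UNT+5+BrevNr5678'"
)


def ftx_text_lookbehind(edi):
    """Return the FTX free text (group 3), or None if no FTX segment matches."""
    match = ESCAPE_AWARE.search(edi)
    return match.group(3) if match else None


print(ftx_text_lookbehind(SAMPLE))
# Med venlig hilsen: IT-Hotline.?' Horsens Sygehus.Telefon 12345678.
```

Because group 3 uses . without a DOTALL-style flag, this sketch assumes the FTX text itself contains no line break.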

Answer 2

Score: 1


Another option could be to exclude matching ' using a negated character class, and only match it when it is directly preceded by a question mark:

FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+([^']*(?:'(?<=\?.)[^']*)*)'

The last part ([^']*(?:'(?<=\?.)[^']*)*) matches:

( Capture group
- [^']* Match optional chars other than '
- (?: Non capture group to repeat as a whole part
  - '(?<=\?.) Match ' and assert ? before it using a positive lookbehind
  - [^']* Match optional chars other than '
- )* Close the non capture group and optionally repeat
) Close the capture group
' Match literally

Regex demo
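
A minimal Python sketch of the negated-character-class pattern follows. Python's re module is an assumption, the sample strings are shortened or invented for illustration, and ftx_text_negated_class is a hypothetical helper name.

```python
import re

# Answer 2's pattern: consume non-quote chars, and only step over a '
# when the character before it is ? (positive lookbehind after the quote).
NEGATED_CLASS = re.compile(
    r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+([^']*(?:'(?<=\?.)[^']*)*)'"
)

# Shortened sample (assumption) with an escaped ?' inside the FTX text.
SAMPLE = (
    "UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'"
    "FTX+VER+P00++Med venlig hilsen: IT-Hotline.?' Horsens Sygehus."
    "Telefon 12345678.'UNT+5+BrevNr5678'"
)

# Invented variant with a line break inside the FTX text.
MULTILINE = "FTX+VER+P00++line one\nline two?' still text.'\nUNT+5+X'"


def ftx_text_negated_class(edi):
    """Return the FTX free text (group 3), or None if no FTX segment matches."""
    match = NEGATED_CLASS.search(edi)
    return match.group(3) if match else None


print(ftx_text_negated_class(SAMPLE))
print(repr(ftx_text_negated_class(MULTILINE)))
```

In Python, [^'] also matches a newline, so this variant keeps working when the text spans several lines, which the . in the other patterns would not do without an extra flag.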
