Regex not in group
Question
You can modify your regex so that it captures the whole FTX text and skips over any ?' inside it by using a negative lookbehind assertion. Here's the modified regex pattern:
FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)(?<!\?)'
The (?<!\?) lookbehind in front of the closing ' only lets that ' match when the character before it is not ?. The lazy (.*?) therefore keeps expanding past every escaped ?' and stops at the first unescaped '.
In your provided EDI text, this regex should capture the complete FTX text even when it contains ?'.
Original question (English):
I am trying to extract some text from an EDI file with a regex. I have this string:
UNA:+.? '
UNB+UNOC:3+5790000120420:14+5790000181872:14+991111:1850+KuvertNr1234'
UNH+BrevNr5678+CONTRL:D:93A:ZZ:C0230Q+CTL02'
UCI+MEDREF01095+5790000181872:14+5790000120420:14+4'
UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'
FTX+VER+P00++EDI-brev med nummeret 1111MAN01095, afsendt 11/11 1999 kl 18.46 har \:ikke kunnet modtages. Horsens Sygehus, laboratoriet kan ikke modtage \:sygehushenvisninger. :Med venlig hilsen: IT-Hotline. Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'UNZ+1+KuvertNr1234'
I need the FTX text, and I have this regex for it: FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'
But in EDIFACT, ? escapes ', so if I add ?' to the text:
UNA:+.? '
UNB+UNOC:3+5790000120420:14+5790000181872:14+991111:1850+KuvertNr1234'
UNH+BrevNr5678+CONTRL:D:93A:ZZ:C0230Q+CTL02'
UCI+MEDREF01095+5790000181872:14+5790000120420:14+4'
UCM+1111MAN01095+MEDREF:D:93A:UN:H0130R+4'
FTX+VER+P00++EDI-brev med nummeret 1111MAN01095, afsendt 11/11 1999 kl 18.46 har \:ikke kunnet modtages. Horsens Sygehus, laboratoriet kan ikke modtage \:sygehushenvisninger. :Med venlig hilsen: IT-Hotline.?' Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'UNZ+1+KuvertNr1234'
My regex stops at the ' char right after the ?. How can I keep using .*? but ignore the ?' in the text?
The EDIFACT can either contain \n line breaks or be one long string without \n.
Tried with: FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'
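The behaviour described above can be reproduced with a short sketch. The question does not name a language, so Python's re module is an assumption here, and the sample string is a shortened stand-in for the FTX segment shown above.

```python
import re

# Shortened stand-in for the FTX segment from the question, with an escaped ?' in the text.
edi = "FTX+VER+P00++IT-Hotline.?' Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'"

# The pattern from the question: the lazy group ends at the first ' it meets,
# whether or not that ' is escaped by a preceding ?.
original = re.compile(r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'")

match = original.search(edi)
print(match.group(3))  # IT-Hotline.?
```

The capture is cut off right after the ?, which is the problem being asked about.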
Answer 1
Score: 1
You can use
FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'(?<!\?')
See the regex demo.
Details:
FTX\+ - the FTX+ string
([a-zA-Z]{2,4}) - Group 1: two to four ASCII letters
\+ - a + char
([a-zA-Z0-9]{3}) - Group 2: three ASCII alphanumeric chars
\+\+ - a ++ string
(.*?) - Group 3: any zero or more chars other than line break chars, as few as possible
'(?<!\?') - a ' char that is not immediately preceded by a ? char.
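As a quick check of this pattern, here is a minimal sketch in Python (an assumption, since the question does not state a language; the lookbehind is fixed-width, so Python's re module accepts it). The sample string is a shortened stand-in for the question's FTX segment.

```python
import re

# Shortened stand-in for the FTX segment from the question, with an escaped ?' in the text.
edi = "FTX+VER+P00++IT-Hotline.?' Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'"

# '(?<!\?') consumes a ' and then rejects it if the two chars ending there are ?'.
pattern = re.compile(r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+(.*?)'(?<!\?')")

match = pattern.search(edi)
group1, group2, text = match.groups()
print(text)  # IT-Hotline.?' Horsens Sygehus.Telefon 12345678.
```

Because (.*?) uses the dot, the captured text cannot span a line break unless the pattern is compiled with re.DOTALL.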
Answer 2
Score: 1
Another option could be to exclude matching ' using a negated character class, and only match it when it is directly preceded by a question mark:
FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+([^']*(?:'(?<=\?.)[^']*)*)'
The last part ([^']*(?:'(?<=\?.)[^']*)*) matches:
( - Capture group
[^']* - Match optional chars other than '
(?: - Non-capture group to repeat as a whole part
'(?<=\?.) - Match ' and assert ? before it using a positive lookbehind
[^']* - Match optional chars other than '
)* - Close the non-capture group and optionally repeat
) - Close the capture group
' - Match literally
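A minimal sketch of this variant, again assuming Python and a shortened stand-in for the question's FTX segment. The final replace step is not part of the answer; it is an illustrative follow-up that turns the escaped ?' back into a plain '.

```python
import re

# Shortened stand-in for the FTX segment from the question, with an escaped ?' in the text.
edi = "FTX+VER+P00++IT-Hotline.?' Horsens Sygehus.Telefon 12345678.'UNT+5+BrevNr5678'"

# [^']* runs up to the next ', and a ' is only consumed inside the group
# when the lookbehind (?<=\?.) confirms that a ? sits directly before it.
pattern = re.compile(
    r"FTX\+([a-zA-Z]{2,4})\+([a-zA-Z0-9]{3})\+\+([^']*(?:'(?<=\?.)[^']*)*)'"
)

match = pattern.search(edi)
raw_text = match.group(3)
print(raw_text)  # IT-Hotline.?' Horsens Sygehus.Telefon 12345678.

# Illustrative follow-up (an assumption, not from the answer): unescape ?' to '.
plain_text = raw_text.replace("?'", "'")
print(plain_text)  # IT-Hotline.' Horsens Sygehus.Telefon 12345678.
```

Unlike the dot, [^'] also matches line breaks, so this pattern keeps working when the FTX text itself is wrapped over several lines. One caveat: in EDIFACT the release character escapes itself, so ??' is a literal ? followed by a real terminator, and a pattern that only checks for a ? directly before the ' will read past it.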