How do I match only once using a regex lookbehind?

Question

I want to build a regex that matches only the answer lines directly before a Key: line, excluding the answer that the key names. Given this input:

Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid

Key: B

Question: Who is credited with inventing the World Wide Web?
A. Steve Jobs
B. Bill Gates
C. Tim Berners-Lee
D. Mark Zuckerberg

Key: C

I want to match:

A. Berlin
C. Rome
D. Madrid

A. Steve Jobs
B. Bill Gates
D. Mark Zuckerberg

Key is A -> match: B, C, D.

Key is B -> match: A, C, D.

Key is C -> match: A, B, D.

Key is D -> match: A, B, C.

This is my regex for when the key is C:

(?<!Key: C)^[ABD].*

But it matches the following, including B. Paris in the first question, where the key is B:

A. Berlin
B. Paris
D. Madrid

A. Steve Jobs
B. Bill Gates
D. Mark Zuckerberg

Can anyone suggest a solution or offer guidance on how to troubleshoot this problem?
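
For reference, the behavior described above can be reproduced with a short Python sketch. It assumes the pattern is applied with multiline matching (so that ^ matches at the start of every line); the names SAMPLE, ATTEMPT and attempt_matches are illustrative only. The lookbehind inspects the characters immediately before the start of the line, which belong to the previous line, while the Key: line comes after the answers, so the assertion never fails and never excludes anything.

```python
import re

# First question block from the sample input (its key is B).
SAMPLE = """\
Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid

Key: B
"""

# The pattern from the question. re.MULTILINE is an assumption about how
# it was run: it makes ^ match at the start of each line.
ATTEMPT = re.compile(r"(?<!Key: C)^[ABD].*", re.MULTILINE)


def attempt_matches(text):
    """Return every line matched by the attempted pattern."""
    return ATTEMPT.findall(text)


# "B. Paris" is matched even though B is the key, and "C. Rome" is skipped
# only because C is missing from the character class [ABD].
print(attempt_matches(SAMPLE))
```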

Answer 1

Score: 3

You can use a lookahead to assert that an answer is not followed by a Key: line with the same key:

^                           # Match an answer that starts at the start of line,
(?<key>[ABCD])              # then a key, which we capture,
\.\s+.+                     # a dot, some spaces and everything else to the end of line,
(?=                         # followed by
  (?:\n[ABCD]\.\s+.+){0,3}  # 0 to 3 more answers, then
  \n                        # a blank line,
  \nKey:\s+                 # then 'Key:' succeeded by some spaces and
  (?!\k<key>)               # something that is not the same as the key we captured.
)

The same regex can also be used in Python, with a couple of minor differences in syntax:

^(?P<key>[ABCD])\.\s+.+
(?=
  (?:\n[ABCD]\.\s+.+){0,3}
  \n
  \nKey:\s+(?!(?P=key))
)
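
As a minimal sketch of how the Python variant might be run: the pattern is written across several lines, so it needs re.VERBOSE (whitespace and comments in the pattern are ignored), and re.MULTILINE so that ^ matches at each line start. The names TEXT, WRONG_ANSWER and wrong_answers are made up for this example, and the input is assumed to use \n line endings with exactly one blank line before each Key: line.

```python
import re

# Sample input copied from the question.
TEXT = """\
Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid

Key: B

Question: Who is credited with inventing the World Wide Web?
A. Steve Jobs
B. Bill Gates
C. Tim Berners-Lee
D. Mark Zuckerberg

Key: C
"""

WRONG_ANSWER = re.compile(
    r"""
    ^(?P<key>[ABCD])\.\s+.+        # an answer line; capture its letter
    (?=
      (?:\n[ABCD]\.\s+.+){0,3}     # up to 3 more answer lines
      \n                           # the newline ending the last answer
      \nKey:\s+(?!(?P=key))        # blank line, then a Key: naming another letter
    )
    """,
    re.MULTILINE | re.VERBOSE,
)


def wrong_answers(text):
    """Return the answer lines whose letter differs from their block's key."""
    return [m.group(0) for m in WRONG_ANSWER.finditer(text)]


for line in wrong_answers(TEXT):
    print(line)
```

With the sample input this prints the six lines listed in the question as the desired matches. Because the key letter is captured and compared with a backreference, one pattern covers all four possible keys.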

Try it on regex101.com: PCRE/PCRE2/Java 8/.NET, ECMAScript, Python.

huangapple
  • Posted on May 28, 2023 at 18:49:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76351093.html