How do I match only once using a regex lookbehind?
Question
I want to build a regex that only matches stuff directly before a condition, like this:
Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid
Key: B
Question: Who is credited with inventing the World Wide Web?
A. Steve Jobs
B. Bill Gates
C. Tim Berners-Lee
D. Mark Zuckerberg
Key: C
I want to match:
A. Berlin
C. Rome
D. Madrid
A. Steve Jobs
B. Bill Gates
D. Mark Zuckerberg
Key is A -> match: B, C, D.
Key is B -> match: A, C, D.
Key is C -> match: A, B, D.
Key is D -> match: A, B, C.
This is my regex for when the key is C:
(?<!Key: C)^[ABD].*
But it will match:
A. Berlin
B. Paris
D. Madrid
A. Steve Jobs
B. Bill Gates
D. Mark Zuckerberg
Can anyone suggest a solution or offer guidance on how to troubleshoot this problem?
Answer 1
Score: 3
You can use a lookahead to assert that an answer is not followed by a Key: line that has the same key:
^ # Match an answer that starts at the start of line,
(?<key>[ABCD]) # then a key, which we capture,
\.\s+.+ # a dot, some spaces and everything else to the end of line,
(?= # followed by
(?:\n[ABCD]\.\s+.+){0,3} # 0 to 3 more answers, then
\n # a blank line,
\nKey:\s+ # then 'Key:' followed by some spaces and
(?!\k<key>) # something that is not the same as the key we captured.
)
The same regex can also be used in Python, with a couple of minor differences in syntax:
^(?P<key>[ABCD])\.\s+.+
(?=
(?:\n[ABCD]\.\s+.+){0,3}
\n
\nKey:\s+(?!(?P=key))
)
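As a rough sketch of how the Python pattern above might be run: because it is laid out across several lines and anchors on `^` for every answer, it needs both `re.VERBOSE` and `re.MULTILINE`. The names `TEXT`, `WRONG_ANSWER` and `wrong_answers` are made up for this illustration, and the sample input assumes a blank line between the last answer and the Key: line, which the pattern requires.

```python
import re

# Illustrative sample input (an assumption): a blank line separates
# the answers from each "Key:" line, as the pattern expects.
TEXT = """\
Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid

Key: B

Question: Who is credited with inventing the World Wide Web?
A. Steve Jobs
B. Bill Gates
C. Tim Berners-Lee
D. Mark Zuckerberg

Key: C
"""

WRONG_ANSWER = re.compile(
    r"""
    ^(?P<key>[ABCD])\.\s+.+        # an answer line; capture its letter
    (?=
      (?:\n[ABCD]\.\s+.+){0,3}     # up to three further answer lines
      \n                           # end of the last answer line
      \nKey:\s+(?!(?P=key))        # blank line, then a key that differs
    )
    """,
    re.MULTILINE | re.VERBOSE,
)

# finditer + group(0) returns the whole line; findall would return
# only the captured letter because the pattern contains a group.
wrong_answers = [m.group(0) for m in WRONG_ANSWER.finditer(TEXT)]
print(wrong_answers)
```

On this sample the list should contain every answer except B. Paris and C. Tim Berners-Lee.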
Try it on regex101.com: PCRE/PCRE2/Java 8/.NET, ECMAScript, Python.
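For context on why the attempt in the question matched every A/B/D line: a lookbehind only inspects the characters immediately before the current position. At the start of an answer line, those characters are the tail of the previous line plus a newline, never "Key: C", so the negative lookbehind always succeeds. A small sketch, with a hypothetical `TEXT` sample and made-up variable names, suggests the lookbehind changes nothing:

```python
import re

# Illustrative sample input (an assumption), shaped like the question's data.
TEXT = """\
Question: What is the capital city of France?
A. Berlin
B. Paris
C. Rome
D. Madrid

Key: B

Question: Who is credited with inventing the World Wide Web?
A. Steve Jobs
B. Bill Gates
C. Tim Berners-Lee
D. Mark Zuckerberg

Key: C
"""

# The pattern from the question, and the same pattern without the lookbehind.
with_lookbehind = re.findall(r"(?<!Key: C)^[ABD].*", TEXT, re.MULTILINE)
without_lookbehind = re.findall(r"^[ABD].*", TEXT, re.MULTILINE)

# Both lists hold every line starting with A, B or D, in both questions.
print(with_lookbehind == without_lookbehind)
print(with_lookbehind)
```

The condition has to be checked by looking forward from each answer to its own Key: line, which is what the lookahead approach does.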