April 20, 2023, 04:19:53

Regex (Python): Matching Integers not Preceded by Character

Question

Based on some string of numbers:

(30123:424302) 123 #4324:#34123

How can I obtain only the numbers that are NOT immediately preceded by "#"? I have found how to get those numbers preceded by "#" (\#+\d+) but I need the opposite. Can I group all \d+ and then inverse match based on the pattern I have somehow?

To clarify, I need 30123, 424302, and 123 in the above example.

Answer 1

Score: 4

You may try this regex with a negative lookbehind + word boundary:

(?<!#)\b\d+

RegEx Details:

(?<!#): A negative lookbehind that fails the match when # appears at the immediately preceding position
\b: A word boundary, so the match can only start at the beginning of a run of word characters
\d+: Match 1+ digits
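
A minimal sketch of how this pattern could be run in Python against the string from the question. The variable names are chosen here for illustration, and the second input is a hypothetical edge case that is not part of the question: because of \b, digits glued to a letter or an underscore are skipped as well, which may or may not be what you want.

```python
import re

text = "(30123:424302) 123 #4324:#34123"
pattern = re.compile(r"(?<!#)\b\d+")

# Numbers from the question that are not directly preceded by "#".
numbers = pattern.findall(text)
print(numbers)  # ['30123', '424302', '123']

# Hypothetical input: "123" and "45" follow word characters, so \b
# does not hold at their start and only "67" is returned.
edge = pattern.findall("abc123 x_45 67")
print(edge)  # ['67']
```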

Answer 2

Score: 1

You need

(?<![#\d])\d+

Pattern details

(?<![#\d]) - a negative lookbehind that fails the match if there is a digit or a # char immediately before the current position
\d+ - one or more digits.

See the Python demo:

import re
text = "(30123:424302) 123 #4324:#34123"
print(re.findall(r"(?<![#\d])\d+", text))
# => ['30123', '424302', '123']
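
To see why the lookbehind class also contains \d, here is a small comparison sketch (the variable names are illustrative). With only # in the lookbehind, the regex engine fails at the first digit after # but then retries one character later, where the preceding character is a digit, and so it returns the tail of the unwanted number.

```python
import re

text = "(30123:424302) 123 #4324:#34123"

# Lookbehind checks only for "#": the tails "324" and "4123" leak through.
without_digit = re.findall(r"(?<!#)\d+", text)
print(without_digit)  # ['30123', '424302', '123', '324', '4123']

# Lookbehind also rejects a preceding digit: a match can only start
# at the first digit of a number, so "#"-prefixed numbers are skipped whole.
with_digit = re.findall(r"(?<![#\d])\d+", text)
print(with_digit)  # ['30123', '424302', '123']
```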

If you want to "reverse" the match the way you originally had in mind, you can match what you do not want, match and capture what you do want, and then remove the empty values from the resulting list:

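import re
text = "(30123:424302) 123 #4324:#34123"
# With one capture group, findall returns '' for matches of the #\d+ branch; filter(None, ...) drops them.
print(list(filter(None, re.findall(r"#\d+|(\d+)", text))))
# => ['30123', '424302', '123']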

Here, #\d+ consumes all digits after a # (i.e. in the undesired context), so those matches leave the capture group empty, while (\d+) captures the values you want.
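
The same match-and-discard idea can also be sketched with re.finditer, keeping only the matches in which the capture group took part. This is an alternative spelling of the approach above rather than something from the original answer, and the name wanted is illustrative.

```python
import re

text = "(30123:424302) 123 #4324:#34123"

# group(1) is None whenever the "#\d+" branch matched, so those
# matches are skipped and only the captured numbers are kept.
wanted = [
    m.group(1)
    for m in re.finditer(r"#\d+|(\d+)", text)
    if m.group(1) is not None
]
print(wanted)  # ['30123', '424302', '123']
```

Checking for None rather than filtering out empty strings makes the intent explicit: a match is dropped because the capture group did not participate, not because it happened to be empty.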
