Why does str_match not capture group the same way regex101 does?
Question
I have this string:
bn = "this_is_a_test_12345.txt"
I want to capture/extract the numeric part (`12345`). On regex101.com, the regular expression I tried works like this:
[![enter image description here][1]][1]
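Yet in R it does not work that way:
str_match(bn, ".*(\\d*).*") # does not work
str_match(bn, ".*_(\\d*).*") # works (second column is the matched group)
I think I am missing something very simple about greediness or the like, but I am not sure...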
[1]: https://i.stack.imgur.com/psLRp.png
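Answer 1
Score: 1
As mentioned in the comments, you need the non-greedy (lazy) form of the leading wildcard, written `.*?`, so that it does not swallow the digits before the capture group gets a chance to match: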
sub(".*?(\\d+).*", "\\1", bn)
# [1] "12345"
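
To see why both changes in that pattern matter (the lazy `.*?` and `\\d+` instead of `\\d*`), the four combinations can be compared side by side. This is only a sketch: `group1` is a hypothetical helper introduced here, not part of the original answer, and `perl = TRUE` is an assumption made so that the backtracking behaviour is the PCRE one rather than that of R's default engine.

```r
bn <- "this_is_a_test_12345.txt"

# Hypothetical helper: replace the whole match with capture group 1.
# perl = TRUE (an assumption of this sketch) selects the PCRE engine.
group1 <- function(pattern, x) sub(pattern, "\\1", x, perl = TRUE)

# Greedy .* consumes the whole string; \\d* is then satisfied by matching nothing.
greedy_star <- group1(".*(\\d*).*", bn)   # ""

# Greedy .* backtracks only as far as needed, so \\d+ gets just the last digit.
greedy_plus <- group1(".*(\\d+).*", bn)   # "5"

# Lazy .*? matches nothing, and \\d* also matches nothing at position 0.
lazy_star <- group1(".*?(\\d*).*", bn)    # ""

# Lazy .*? expands until \\d+ can match, which then takes the full run of digits.
lazy_plus <- group1(".*?(\\d+).*", bn)    # "12345"

# If only the digits are needed, matching them directly avoids the wildcards.
direct <- regmatches(bn, regexpr("\\d+", bn, perl = TRUE))  # "12345"

print(c(greedy_star, greedy_plus, lazy_star, lazy_plus, direct))
```

The pattern from the question that did work, `.*_(\\d*).*`, succeeds for a related reason: the greedy `.*` has to give characters back until the literal `_` can match, which leaves the digits for the capture group.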