Why does `str_match` not capture groups the same way regex101 does?

Question

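I have this string:

    bn = "this_is_a_test_12345.txt"

I want to capture/extract the numeric part (`12345`). Trying it on regex101.com works like this:

[![enter image description here][1]][1]

Yet doing the same in R does not work:

    library(stringr)

    str_match(bn, ".*(\\d*).*")   # does not work
    str_match(bn, ".*_(\\d*).*")  # works (the second column is the captured group)

I think I am missing something very simple about greediness or the like, but I am not sure...

  [1]: https://i.stack.imgur.com/psLRp.png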

Answer 1

Score: 1

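As mentioned in the comments, you need a non-greedy quantifier, written with `?`, so that the leading `.*` does not consume the digits:

    sub(".*?(\\d+).*", "\\1", bn)
    # [1] "12345"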
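
The difference between the patterns can be seen by running them side by side. This is a minimal sketch, assuming the stringr package is installed; the variable names (`greedy`, `anchored`, `lazy`, `lazy_star`, `digits`) are illustrative and not from the original post.

```r
library(stringr)

bn <- "this_is_a_test_12345.txt"

# Greedy: the first .* consumes the whole string, so (\\d*) is left
# with nothing and matches the empty string at the end.
greedy <- str_match(bn, ".*(\\d*).*")
greedy[, 2]
# [1] ""

# The literal "_" forces the first .* to back off to the last underscore,
# which happens to sit right before the digits.
anchored <- str_match(bn, ".*_(\\d*).*")
anchored[, 2]
# [1] "12345"

# Lazy .*? combined with \\d+ (at least one digit) stops at the first digit.
lazy <- str_match(bn, ".*?(\\d+).*")
lazy[, 2]
# [1] "12345"

# Lazy .*? alone is not enough: \\d* can still match zero digits at
# position 0, so the group is empty again.
lazy_star <- str_match(bn, ".*?(\\d*).*")
lazy_star[, 2]
# [1] ""

# If only the first run of digits is needed, the surrounding .* can be dropped.
digits <- str_extract(bn, "\\d+")
digits
# [1] "12345"
```

Note that the fix relies on two changes together: the lazy `.*?` and the switch from `\\d*` to `\\d+`.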
