Go Regexp to Match Characters Between
Question
I have content I am trying to remove from a string:
s := `Hello! <something>My friend</something>this is some <b>content</b>.`
I want to remove <b>content</b> and <something>My friend</something> so that the string becomes
`Hello! this is some .`
So basically, I want to remove each tag pair together with everything between it, and I tried the pattern <.*> for that.
The problem is that the regex matches <something>My friend</something>this is some <b>content</b> as a single match, because Go matches from the first < all the way to the very last >.
Answer 1
Score: 4
* is a greedy operator, meaning it matches as much as it can while still allowing the remainder of the regular expression to match. In this case I would suggest using negated character classes, since backreferences are not supported (so the pattern cannot tie the closing tag name to the opening one).
s := "Hello! <something>My friend</something>this is some <b>content</b>."
re := regexp.MustCompile("<[^/]*/[^>]*>")
fmt.Println(re.ReplaceAllString(s, ""))
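For anyone who wants to run this directly, here is a minimal self-contained sketch of the same idea. The helper name stripTagged is my own, not part of any library, and the pattern assumes simple tags like the ones in the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// tagWithBody matches "<", a run of non-slash characters, the "/" of the
// closing tag, and then everything up to the next ">".
var tagWithBody = regexp.MustCompile(`<[^/]*/[^>]*>`)

// stripTagged is a hypothetical helper that removes every match of
// tagWithBody from s.
func stripTagged(s string) string {
	return tagWithBody.ReplaceAllString(s, "")
}

func main() {
	s := "Hello! <something>My friend</something>this is some <b>content</b>."
	fmt.Println(stripTagged(s)) // Hello! this is some .
}
```

Because the first character class stops at the first "/", an opening tag whose attributes contain a slash (for example <a href="/x">text</a>) is matched on its own and the enclosed text is left in place, so treat this as a sketch for simple input rather than a general HTML stripper.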
Answer 2
Score: 2
Go's regexp package does not backtrack and therefore does not support backreferences, so you can't use <(.*?)>.*?</\1> like you would in Perl.
However, if you don't care whether the closing tag matches the opening one, you can use:
<.*?/.*?>
Just saw your update: .* is greedy, so it matches everything in between. You have to use non-greedy matching (i.e. .*?).
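To make the difference concrete, here is a small sketch that runs the greedy pattern from the question next to the non-greedy one on the same input. The function names removeGreedy and removeLazy are made up for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// greedyTag is the pattern from the question: .* grabs as much as possible.
	greedyTag = regexp.MustCompile(`<.*>`)
	// lazyTag uses .*? so each part of the match is as short as possible.
	lazyTag = regexp.MustCompile(`<.*?/.*?>`)
)

// removeGreedy deletes every match of the greedy pattern from s.
func removeGreedy(s string) string {
	return greedyTag.ReplaceAllString(s, "")
}

// removeLazy deletes every match of the non-greedy pattern from s.
func removeLazy(s string) string {
	return lazyTag.ReplaceAllString(s, "")
}

func main() {
	s := "Hello! <something>My friend</something>this is some <b>content</b>."
	fmt.Printf("%q\n", removeGreedy(s)) // "Hello! ."
	fmt.Printf("%q\n", removeLazy(s))   // "Hello! this is some ."
}
```

Note that . does not match a newline by default in Go, so if a tag's content can span several lines you would likely need the (?s) flag at the start of the pattern.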