Go Regexp to Match Characters Between

Question

I have some content that I am trying to remove from a string:

s := `Hello! <something>My friend</something>this is some <b>content</b>.`

I want to remove <b>content</b> and <something>My friend</something> so that the string becomes

`Hello! this is some .`

So basically, I want to remove each tag pair and everything inside it, and I tried the pattern <.*>.

The problem is that the regex matches <something>My friend</something>this is some <b>content</b> as one single match, because Go matches from the first < all the way to the very last >.

Answer 1

Score: 4

* is a greedy operator, meaning it matches as much as it can while still allowing the remainder of the regular expression to match. In this case, I would suggest using negated character classes, since backreferences are not supported.

s := "Hello! <something>My friend</something>this is some <b>content</b>."
re := regexp.MustCompile("<[^/]*/[^>]*>")
fmt.Println(re.ReplaceAllString(s, ""))
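
For anyone who wants to run this outside the playground, here is a minimal self-contained sketch comparing the greedy pattern from the question with the negated-character-class pattern. The function names `stripGreedy` and `stripPaired` are made up for illustration. It assumes the text inside a tag pair never contains a `/` and that there are no self-closing tags.

```go
package main

import (
	"fmt"
	"regexp"
)

// Greedy pattern from the question: matches from the first '<' to the last '>'.
var greedyRe = regexp.MustCompile(`<.*>`)

// Negated character classes: '<', anything up to the first '/',
// then anything up to the next '>'.
var pairedRe = regexp.MustCompile(`<[^/]*/[^>]*>`)

// stripGreedy (hypothetical helper) removes every match of the greedy pattern.
func stripGreedy(s string) string {
	return greedyRe.ReplaceAllString(s, "")
}

// stripPaired (hypothetical helper) removes every match of the
// negated-character-class pattern.
func stripPaired(s string) string {
	return pairedRe.ReplaceAllString(s, "")
}

func main() {
	s := `Hello! <something>My friend</something>this is some <b>content</b>.`
	fmt.Printf("%q\n", stripGreedy(s)) // "Hello! ."
	fmt.Printf("%q\n", stripPaired(s)) // "Hello! this is some ."
}
```

The greedy version swallows the text between the two tag pairs, while the character-class version stops at the first closing tag it can complete.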

Answer 2

Score: 2

Go's regexp package does not support backreferences (its engine does not backtrack), so you can't use <(.*?)>.*?</\1> like you would in Perl.

However, if you don't care whether the closing tag matches the opening one, you can use:

<.*?/.*?>

Just saw your update: .* is a greedy operator, so it will match everything in between. You have to use non-greedy matching (aka .*?).
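
As a rough sketch of the non-greedy approach, the program below applies `<.*?/.*?>` to the string from the question. The helper name `stripLazy` is hypothetical. The second call shows the trade-off mentioned above: since the closing tag is not checked against the opening one, a mismatched pair is removed as well.

```go
package main

import (
	"fmt"
	"regexp"
)

// Non-greedy pattern: '<', as little as possible up to a '/',
// then as little as possible up to the next '>'.
var lazyRe = regexp.MustCompile(`<.*?/.*?>`)

// stripLazy (hypothetical helper) removes every match of the non-greedy pattern.
func stripLazy(s string) string {
	return lazyRe.ReplaceAllString(s, "")
}

func main() {
	s := `Hello! <something>My friend</something>this is some <b>content</b>.`
	fmt.Printf("%q\n", stripLazy(s)) // "Hello! this is some ."

	// The closing tag name is not verified, so this mismatched pair is removed too.
	fmt.Printf("%q\n", stripLazy(`<a>x</b>!`)) // "!"
}
```

Note that `.` does not match a newline by default in Go, so a tag pair spanning several lines would need the `(?s)` flag.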

huangapple
  • Posted on 2014-10-22 02:24:02
  • Link to this article: https://go.coder-hub.com/26493620.html