Go: get matching substrings from a string

Question

I'm trying to extract all the words in a string that are between quotes.

Here's my current code:

func StrExtract(word string) []string {
  r, _ := regexp.Compile(`".*"`)
  result := r.FindAllString(word, -1)
  RemoveDuplicates(&result)
  return result
}

With an input like:

`Hi guys, this is a "test" and a "demo" ok?`

I get the output:

["test" and a "demo"]

But I'd like to get:

[test demo]

Please help me fix this, or suggest better alternatives.

Answer 1

Score: 2

If you want to keep it simple, add a lazy quantifier, .*?, which makes the regex ".*?". You are getting "test" and a "demo" because a bare .* is greedy and matches as much text as possible, so the match runs from the " before test to the " after demo and ignores the quotes in between.

A generally better, though slightly more involved, approach is a negated character class, "[^"]*", which cannot match a quote between the two delimiters. It also behaves differently in one other way: unlike ., [^"] matches newlines, so a quoted span may run across lines. If you don't want that, exclude newlines as well with [^"\n]. Or perhaps matching across lines is exactly what you want.

Since you also want to drop the quotes, something extra is needed. In regex flavors that support them you could use lookarounds: (?<=")[^"]*(?="). Go's regexp package uses RE2 syntax, which has no lookarounds, so in Go use a capture group instead: "(.*?)" or "([^"]*)". With the capture-group route you have to read the capture group (for example via FindAllStringSubmatch), not the whole match.

Answer 2

Score: 2

Regex:

"(.*?)"

Here is an online demo:
https://regex101.com/r/sI4tA9/1

All you have to do now is join the matches. Unfortunately, I don't know Go well enough to help with that part.

Posted by huangapple on July 24, 2015, 18:03:05
Source: https://go.coder-hub.com/31607652.html