Regex "before text matching" in GoLang
Question
I have a piece of JavaScript code that I'm trying to port to Go. The logic requires me to split the following string on ";" only when the semicolon is followed by "I" or "D", or is at the end of the string:
I.E.viewability:-2;D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36;D.G.city:Burnaby;D.G.zip:V5C;D.G.region:BC;D.G.E.country_code2:CA;
In JavaScript I accomplish this using:
/;(?=[ID]|$)/
My understanding is that Go's regexp package follows the RE2 syntax documented at
https://github.com/google/re2/wiki/Syntax
which lists the lookahead construct used above, (?=re), described there as "before text matching re", as not supported.
What would be the correct way of achieving the same result in Go?
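For reference, the limitation is easy to confirm with a minimal sketch. The helper name compiles is hypothetical and exists only for this illustration; regexp.Compile is the standard library call, and it returns an error for syntax the package does not accept.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// Hypothetical helper, used only for this illustration.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The JavaScript pattern relies on a lookahead, which is not accepted.
	fmt.Println(compiles(`;(?=[ID]|$)`))
	// The same pattern without the lookahead compiles, but it would
	// consume the "I" or "D" together with the semicolon.
	fmt.Println(compiles(`;([ID]|$)`))
}
```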
Answer 1
Score: 4
You may "reverse" the regex so that it matches the strings you need rather than the separators between them. Each match is 1 or more chars other than ;, optionally extended across any ; that is not followed by I or D.
Use
[^;]+(?:;[^ID;][^;]*)*
Details:
[^;]+ - 1 or more chars other than ;
(?:;[^ID;][^;]*)* - zero or more sequences of:
    ; - a ;
    [^ID;] - a char other than I, D or ; (so that empty values are not matched)
    [^;]* - zero or more chars other than ;
Go demo:
package main

import (
    "fmt"
    "regexp"
)

func main() {
    // Match each field instead of splitting on the separator:
    // a run of non-";" chars, extended over any ";" not followed by I, D or ";".
    re := regexp.MustCompile(`[^;]+(?:;[^ID;][^;]*)*`)
    str := `I.E.viewability:-2;D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36;D.G.city:Burnaby;D.G.zip:V5C;D.G.region:BC;D.G.E.country_code2:CA;`

    // FindAllString with n = -1 returns every non-overlapping match.
    for _, match := range re.FindAllString(str, -1) {
        fmt.Println(match)
    }
}
Output:
I.E.viewability:-2
D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36
D.G.city:Burnaby
D.G.zip:V5C
D.G.region:BC
D.G.E.country_code2:CA