Regex "before text matching" in GoLang
Question
I have a piece of JavaScript code that I'm trying to replace with GoLang. The logic requires me to split the following string on ";" only when followed by "I" or "D":
I.E.viewability:-2;D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36;D.G.city:Burnaby;D.G.zip:V5C;D.G.region:BC;D.G.E.country_code2:CA;
In JavaScript I accomplish this using:
/;(?=[ID]|$)/
My understanding is that GoLang uses this regex lib:
https://github.com/google/re2/wiki/Syntax
which clearly shows the above syntax (called "before text matching re") as not supported.
What would be the correct way of achieving the same result in GoLang?
Answer 1
Score: 4
You may "reverse" the regex so that it matches the strings you need rather than the separators. Each item is 1 or more chars other than ;, optionally followed by any number of ; that are not followed with I or D, together with the text after each of them.
Use
[^;]+(?:;[^ID;][^;]*)*
Details:
[^;]+ - 1 or more chars other than ;
(?:;[^ID;][^;]*)* - zero or more sequences of:
    ; - a ;
    [^ID;] - a char other than I, D or ; (so that empty values are not matched)
    [^;]* - zero or more chars other than ;
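As a quick check of why the rewrite is needed, the sketch below feeds both patterns to regexp.Compile. The helper name compiles is only an illustration and is not part of the original answer. The lookahead version should be rejected, while the rewritten pattern uses only character classes and a non-capturing group, both of which RE2 syntax supports.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper that reports whether Go's regexp
// package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// The JavaScript pattern relies on a lookahead, which RE2 syntax lacks.
	fmt.Println(compiles(`;(?=[ID]|$)`))
	// The rewritten pattern avoids lookarounds entirely.
	fmt.Println(compiles(`[^;]+(?:;[^ID;][^;]*)*`))
}
```

Using regexp.Compile instead of regexp.MustCompile here keeps the unsupported pattern from panicking, so both results can be printed side by side.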
Go demo:
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Match each item instead of splitting on the separators.
	re := regexp.MustCompile(`[^;]+(?:;[^ID;][^;]*)*`)
	str := `I.E.viewability:-2;D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36;D.G.city:Burnaby;D.G.zip:V5C;D.G.region:BC;D.G.E.country_code2:CA;`
	for _, match := range re.FindAllString(str, -1) {
		fmt.Println(match)
	}
}
Output:
I.E.viewability:-2
D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36
D.G.city:Burnaby
D.G.zip:V5C
D.G.region:BC
D.G.E.country_code2:CA
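If a regex is not required, the same split can also be sketched with a plain loop. This is an alternative to the answer's pattern, not part of it, and splitFields is a hypothetical name. It assumes the separators are ASCII, and it drops the empty trailing item that JavaScript's split would return for a string ending in ";".

```go
package main

import "fmt"

// splitFields is a hypothetical helper: it cuts s at every ';' that is
// followed by 'I' or 'D', or that ends the string, and drops the ';' itself.
func splitFields(s string) []string {
	var fields []string
	start := 0
	for i := 0; i < len(s); i++ {
		if s[i] != ';' {
			continue
		}
		atEnd := i == len(s)-1
		if atEnd || s[i+1] == 'I' || s[i+1] == 'D' {
			fields = append(fields, s[start:i])
			start = i + 1
		}
	}
	// Keep any text left after the last separator; skip the empty tail
	// that a trailing ';' would otherwise produce.
	if start < len(s) {
		fields = append(fields, s[start:])
	}
	return fields
}

func main() {
	str := `I.E.viewability:-2;D.ua:Mozilla/5.0 (Linux; Android 7.0; SM-G920W8 Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36;D.G.city:Burnaby;D.G.zip:V5C;D.G.region:BC;D.G.E.country_code2:CA;`
	for _, field := range splitFields(str) {
		fmt.Println(field)
	}
}
```

Unlike the regex, this version keeps empty items in the middle of the string (for example in ";I" at the very start), so the two approaches can differ on inputs that contain consecutive separators.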