Go equivalent to PHP preg_match
Question
I have a small PHP script that runs through my Apache log, and I'm trying to convert it to Go. However, I'm having some difficulty finding a good equivalent to the PHP function preg_match.
In my PHP script I run preg_match on each line in the log file like this:
preg_match('/([.0-9]+) .*?\[([0-9a-zA-Z:\/+ ]+)\].*?"[A-Z]+ \/([^\/ ]+)\/([a-zA-Z0-9\-.]+).*" ([0-9]{3}) .*"(.*?)"$/', $line, $matches)
Running this expression on this log line:
>100.100.100.100 - - [23/Feb/2015:03:03:56 +0100] "GET /folder/file.mp3 HTTP/1.1" 206 5637064 "-" "AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)"
returns the following array (I am only really interested in [1] through [6]):
Array
(
[0] => 100.100.100.100 - - [23/Feb/2015:03:03:56 +0100] "GET /folder/file.mp3 HTTP/1.1" 206 5637064 "-" "AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)"
[1] => 100.100.100.100
[2] => 23/Feb/2015:03:03:56 +0100
[3] => folder
[4] => file.mp3
[5] => 206
[6] => AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)
)
So my question is: is there a good equivalent to this in Go? I have tried some of the different regexp methods but can't seem to find one that works for me.
Thanks
Answer 1
Score: 8
First, be aware that you may need to modify the regex pattern itself, since Go's regex engine does not behave exactly like PHP's. PHP's preg functions use PCRE, while Go's regexp package implements the RE2 syntax, which leaves out some PCRE features such as backreferences and lookaround assertions. However, the pattern from your example should work in Go without modification.
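To make the difference concrete, here is a minimal sketch that checks whether a few patterns compile in Go. The helper name `compiles` and the sample patterns are illustrative assumptions, not part of the original question.

```go
package main

import (
	"fmt"
	"regexp"
)

// compiles is a hypothetical helper: it reports whether Go's regexp
// package accepts the given pattern.
func compiles(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	patterns := []string{
		`([0-9]{3})`, // plain capturing group: accepted
		`(a+)\1`,     // backreference: valid in PCRE, rejected by Go
		`foo(?=bar)`, // lookahead: valid in PCRE, rejected by Go
	}
	for _, p := range patterns {
		fmt.Printf("%-12s compiles: %v\n", p, compiles(p))
	}
}
```

If a PHP pattern relies on one of the rejected constructs, it has to be rewritten (or the check moved into Go code) before it can be ported.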
Here is an example program in Go that works like PHP's preg_match():
package main

import (
	"fmt"
	"regexp"
)

func main() {
	str := `100.100.100.100 - - [23/Feb/2015:03:03:56 +0100] "GET /folder/file.mp3 HTTP/1.1" 206 5637064 "-" "AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)"`

	// MustCompile panics on an invalid pattern instead of silently
	// discarding the compile error.
	r := regexp.MustCompile(`([.0-9]+) .*?\[([0-9a-zA-Z:\/+ ]+)\].*?"[A-Z]+ \/([^\/ ]+)\/([a-zA-Z0-9\-.]+).*" ([0-9]{3}) .*"(.*?)"$`)

	// FindStringSubmatch returns the full match at index 0 followed by
	// the individual capturing groups.
	for index, match := range r.FindStringSubmatch(str) {
		fmt.Printf("[%d] %s\n", index, match)
	}
}
Output:
[0] 100.100.100.100 - - [23/Feb/2015:03:03:56 +0100] "GET /folder/file.mp3 HTTP/1.1" 206 5637064 "-" "AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)"
[1] 100.100.100.100
[2] 23/Feb/2015:03:03:56 +0100
[3] folder
[4] file.mp3
[5] 206
[6] AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)
See the documentation for Go's regexp package: http://golang.org/pkg/regexp/
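When porting a loop over log lines, it may help to wrap this in a small function that follows preg_match's "did it match, and what were the groups" convention. The sketch below is one way to do that, not the only one: `pregMatch` and `logLine` are hypothetical names, and the second input line is made up to show the no-match case. FindStringSubmatch returns nil when nothing matches, which is what the helper checks.

```go
package main

import (
	"fmt"
	"regexp"
)

// logLine holds the pattern from the answer above. Compiling it once at
// package level avoids recompiling it for every log line.
var logLine = regexp.MustCompile(`([.0-9]+) .*?\[([0-9a-zA-Z:\/+ ]+)\].*?"[A-Z]+ \/([^\/ ]+)\/([a-zA-Z0-9\-.]+).*" ([0-9]{3}) .*"(.*?)"$`)

// pregMatch is a hypothetical helper: it returns the submatches and
// whether the line matched at all.
func pregMatch(re *regexp.Regexp, line string) ([]string, bool) {
	m := re.FindStringSubmatch(line)
	return m, m != nil
}

func main() {
	lines := []string{
		`100.100.100.100 - - [23/Feb/2015:03:03:56 +0100] "GET /folder/file.mp3 HTTP/1.1" 206 5637064 "-" "AppleCoreMedia/1.0.0.12B466 (iPhone; U; CPU OS 8_1_3 like Mac OS X; da_dk)"`,
		`not an access log line`, // illustrative input that should not match
	}
	for _, line := range lines {
		if m, ok := pregMatch(logLine, line); ok {
			// m[1]..m[6] correspond to $matches[1]..$matches[6] in PHP.
			fmt.Println("ip:", m[1], "folder:", m[3], "file:", m[4], "status:", m[5])
		} else {
			fmt.Println("no match")
		}
	}
}
```

Checking the boolean before indexing matters: indexing into a nil slice would panic, whereas PHP simply leaves $matches empty.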
Answer 2
Score: 0
Maybe this will help someone in the future. I used the function regexp.FindAllString(); here is an example: https://go.dev/play/p/Kh7uV55J1Re
package main

import (
	"fmt"
	"regexp"
)

func main() {
	regex := regexp.MustCompile(`"(pass[a-zA-Z\_\-]+|pwd?[a-zA-Z\_\-]+)":\s?"?[0-9a-zA-Z\;\+\*\-\_"\'\@\!\#\\\~$\%\^\&\(\)\:\;]+"?,?`)
	jsonStr := `{"a": {"erd": {"dd": false, "wsr": 0, "dddd": 8, "tttt": "dddddd", "edfgg": 15, "ddddddddd": "wwww"}, "jjj": {"b": "e"}, "qqq": {"wwww": "yyyy1", "wwwwwwwww": 1}, "res": {"f": 0, "er": 5, "ff": 1, "rr": 0, "rer": 3}}, "d": {"re": {"rd": 100, "url": "bug", "timeout": 10000}, "nug": {"jun": 100, "rew": 10001, "url": "car"}, "oldsc": {"dot": 10001, "url": "rop", "link": 100}, "wwwwe": {"l": 10001, "qaq": 2000, "sss": 100, "wwww": "fff"}}, "ff": {"edf": "^/[^/]$", "ffff": "[^/]$"}, "les": {"er": "boo", "nope": "ro"}, "gggg": {"ggg": {"pwd": "ttt", "trf": 1000, "gggg": 0, "pwdPet": "wddd", "password": "fff;f"}, "ttt": {"ttt": null, "ttth": {}, "tttt": {"ttt": "ttttt", "ggggg": "ttt", "ggggtg": 345, "password": "guest"}}, "tsff": {"ggg": 56, "hfgg": "", "tthhl": {"ffg": 1000000, "ttt": 10000, "tyf": 30000000}, "dghgfb": 0, "hhjjhh": "hhgg"}}}`
	for index, i := range regex.FindAllString(jsonStr, -1) {
		fmt.Println("index:", index, "value:", i)
	}
}
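FindAllString returns only the full text of each match. If the capture groups of every match are needed, closer to PHP's preg_match_all, FindAllStringSubmatch can be used instead. The following is a rough sketch under stated assumptions: the pattern `keyValue`, the helper `secretPairs`, and the short JSON sample are all simplified illustrations, not the pattern or data from the answer above.

```go
package main

import (
	"fmt"
	"regexp"
)

// keyValue is an illustrative, simplified pattern: group 1 is a key that
// starts with "pass" or "pwd", group 2 is its quoted string value.
var keyValue = regexp.MustCompile(`"(pass\w*|pwd\w*)":\s*"([^"]*)"`)

// secretPairs is a hypothetical helper that returns a [key, value] pair
// for every match found in s.
func secretPairs(s string) [][2]string {
	var pairs [][2]string
	// Each element m holds the full match at m[0] and the groups after it.
	for _, m := range keyValue.FindAllStringSubmatch(s, -1) {
		pairs = append(pairs, [2]string{m[1], m[2]})
	}
	return pairs
}

func main() {
	jsonStr := `{"pwd": "ttt", "trf": 1000, "pwdPet": "wddd", "password": "fff;f"}`
	for _, p := range secretPairs(jsonStr) {
		fmt.Printf("key=%s value=%s\n", p[0], p[1])
	}
}
```

The second argument -1 asks for all matches; a positive number would cap how many are returned.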