Comparing strings in Go
Question
I'm trying to find the beginning of named capturing groups in a string to create a simple parser (see related question). To do this, the extract function remembers the last four characters in the last4 variable. If the last four characters are equal to "(?P<", it is the beginning of a capturing group:
package main

import "fmt"

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
	extract(sample)
}

func extract(regex string) {
	last4 := new([4]int32)
	for _, c := range regex {
		last4[0], last4[1], last4[2], last4[3] = last4[1], last4[2], last4[3], c
		last4String := fmt.Sprintf("%c%c%c%c\n", last4[0], last4[1], last4[2], last4[3])
		if last4String == "(?P<" {
			fmt.Print("start of capturing group")
		}
	}
}
http://play.golang.org/p/pqA-wCuvux
But this code prints nothing! last4String == "(?P<" is never true, although this substring appears in the output if I print last4String inside the loop. How do I compare strings in Go, then?
And is there a more elegant way to convert an int32 array to a string than fmt.Sprintf("%c%c%c%c\n", last4[0], last4[1], last4[2], last4[3])?
Anything else that could be better? My code looks somewhat inelegant to me.
Answer 1
Score: 3
If this isn't for self-education or similar, you probably want to use the existing RE parser in the standard library (package regexp/syntax) and then "walk" the AST to do whatever is required.
func Parse(s string, flags Flags) (*Regexp, error)
> Parse parses a regular expression string s, controlled by the specified Flags,
> and returns a regular expression parse tree. The syntax is described in the
> top-level comment for package regexp.
There's even a helper for your task.
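As a rough sketch of the parser-based approach: the snippet below parses the pattern with syntax.Parse and lists the named groups via the CapNames method of *syntax.Regexp. Whether CapNames is the helper the answer originally linked to is an assumption, and namedGroups is a hypothetical function name used only for illustration.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// namedGroups is a hypothetical helper: it parses expr with Perl-style
// syntax and returns the names of all named capturing groups, in order.
func namedGroups(expr string) ([]string, error) {
	re, err := syntax.Parse(expr, syntax.Perl)
	if err != nil {
		return nil, err
	}
	var names []string
	// CapNames returns one entry per capture index; unnamed groups
	// (and index 0, the whole match) have an empty name.
	for _, name := range re.CapNames() {
		if name != "" {
			names = append(names, name)
		}
	}
	return names, nil
}

func main() {
	names, err := namedGroups(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names)
}
```

This reports group names rather than byte offsets, so it fits best when the goal is to know which groups exist, not where they start in the source string.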
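EDIT1: Your code repaired (the trailing "\n" in the Sprintf format made last4String five characters long, so it could never equal "(?P<"):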
package main

import "fmt"

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
	extract(sample)
}

func extract(regex string) {
	var last4 [4]int32
	for _, c := range regex {
		last4[0], last4[1], last4[2], last4[3] = last4[1], last4[2], last4[3], c
		last4String := fmt.Sprintf("%c%c%c%c", last4[0], last4[1], last4[2], last4[3])
		if last4String == "(?P<" {
			fmt.Println("start of capturing group")
		}
	}
}
(Also here)
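On the question of a more elegant conversion than fmt.Sprintf: since rune is an alias for int32, slicing the array and converting it with string(...) should give the same four-character string. The following is a minimal sketch of that idea; isGroupStart and countGroupStarts are hypothetical names, not part of the original code.

```go
package main

import "fmt"

// isGroupStart reports whether the four runes in window spell "(?P<".
// string(window[:]) converts the rune slice directly to a string,
// so no format string (and no stray "\n") is involved.
func isGroupStart(window [4]rune) bool {
	return string(window[:]) == "(?P<"
}

// countGroupStarts slides a four-rune window over regex and counts
// how many times the window matches the start of a named group.
func countGroupStarts(regex string) int {
	var last4 [4]rune
	n := 0
	for _, c := range regex {
		last4[0], last4[1], last4[2], last4[3] = last4[1], last4[2], last4[3], c
		if isGroupStart(last4) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countGroupStarts(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`))
}
```

Because Go arrays are comparable, the string conversion could also be skipped by comparing last4 against [4]rune{'(', '?', 'P', '<'} directly.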
EDIT2: Your code rewritten:
package main

import (
	"fmt"
	"strings"
)

const sample string = `/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`

func main() {
	extract(sample)
}

func extract(regex string) {
	start := 0
	for {
		i := strings.Index(regex[start:], "(?P<")
		if i < 0 {
			break
		}
		fmt.Printf("start of capturing group @ %d\n", start+i)
		start += i + 1
	}
}
(Also here)
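For use inside a parser, it may be more convenient to collect the offsets instead of printing them. The sketch below is one possible variation on the strings.Index loop; groupStarts is a hypothetical name, and skipping past the whole marker after each hit is a design choice of this sketch rather than something the original code does.

```go
package main

import (
	"fmt"
	"strings"
)

const marker = "(?P<"

// groupStarts returns the byte offset of every occurrence of "(?P<"
// in regex, in ascending order.
func groupStarts(regex string) []int {
	var offsets []int
	start := 0
	for {
		i := strings.Index(regex[start:], marker)
		if i < 0 {
			return offsets
		}
		offsets = append(offsets, start+i)
		// Continue searching after this marker; the marker cannot
		// overlap with itself, so skipping its full length is safe.
		start += i + len(marker)
	}
}

func main() {
	fmt.Println(groupStarts(`/(?P<country>m((a|b).+)(x|y)n)/(?P<city>.+)`))
}
```

Note that a plain substring search like this also matches an escaped or bracketed "(?P<" (for example inside a character class), which is one reason the parser-based approach is more robust.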