Go regexp: match three asterisks

Question

So I did this:

r, _ := regexp.Compile("* * *")
r2 := r.ReplaceAll(b, []byte("<hr>"))

and got:

panic: runtime error: invalid memory address or nil pointer dereference

So I figured I had to escape them:

r, _ := regexp.Compile("\* \* \*")

But then I got an "unknown escape sequence" error.

I'm a Go beginner. What am I doing wrong?

Answer 1

Score: 5


You are not checking errors.

regexp.Compile gives you two results:

  1. the compiled pattern (or nil)
  2. the error while compiling the pattern (or nil)

You are ignoring the error and then using the nil result. Observe:

r, err := regexp.Compile("* * *")

fmt.Println("r:", r)
fmt.Println("err:", err)

Running this code shows that there is indeed an error. The error is:

> error parsing regexp: missing argument to repetition operator: *
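To surface the failure explicitly instead of hitting a nil pointer panic, the error can be checked before the compiled pattern is used. The following is a minimal sketch, not the asker's actual program; replaceWithRule is a hypothetical helper name introduced only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// replaceWithRule is a hypothetical helper (not from the original post).
// It compiles pattern and, only if compilation succeeded, replaces every
// match in b with "<hr>".
func replaceWithRule(pattern string, b []byte) ([]byte, error) {
	r, err := regexp.Compile(pattern)
	if err != nil {
		// r is nil here, so calling r.ReplaceAll would panic.
		return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
	}
	return r.ReplaceAll(b, []byte("<hr>")), nil
}

func main() {
	// The unescaped pattern from the question is rejected by the regexp parser.
	if _, err := replaceWithRule("* * *", []byte("a * * * b")); err != nil {
		fmt.Println("error:", err)
	}
}
```

Returning the error to the caller keeps the decision about how to react (log, abort, fall back) out of the helper.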

So yes, you are right: you have to escape the repetition operator *. You tried the following:

r, err := regexp.Compile("\* \* \*")

And consequently you got the following error from the compiler:

> unknown escape sequence: *

Inside a double-quoted string literal, the Go compiler itself interprets backslash escape sequences such as \n or \r, which stand for special characters that are hard to type directly, and replaces them with the corresponding characters. \* is not a valid escape sequence, so the compiler rejects the literal. What you want to do is escape the backslash itself, so that the string contains a literal backslash and the regexp parser receives \*.

So, the correct code is:

r, err := regexp.Compile("\\* \\* \\*")

The simplest way of dealing with this kind of quirk is to use a raw string literal (delimited by back quotes, `...`) instead of normal quotes:

r, err := regexp.Compile(`\* \* \*`)

Raw string literals do not process escape sequences at all, so the backslashes reach the regexp parser unchanged.
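As a quick check, the double-quoted and back-quoted spellings can be compared directly, since both should denote the same seven-character pattern. The sketch below also uses regexp.QuoteMeta from the standard library, which escapes regexp metacharacters in a literal string; the function names samePattern and quotedAsterisks are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// samePattern reports whether the interpreted ("...") and raw (`...`)
// spellings of the pattern denote the same string.
func samePattern() bool {
	interpreted := "\\* \\* \\*" // each \\ becomes one backslash
	raw := `\* \* \*`            // backslashes are kept as written
	return interpreted == raw
}

// quotedAsterisks lets the regexp package do the escaping of the
// literal text "* * *".
func quotedAsterisks() string {
	return regexp.QuoteMeta("* * *")
}

func main() {
	fmt.Println(samePattern())
	fmt.Println(quotedAsterisks())
}
```

regexp.QuoteMeta is handy when the text to match comes from a variable rather than a literal written in the source.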

Answer 2

Score: 4

Try escaping your '*' (since '*' is a special character used for repetition in the re2 syntax)

r, err := regexp.Compile(`\* \* \*`)
// and yes, always check the error
// or at least use regexp.MustCompile() if you want to fail fast

Note the use of back quotes (a raw string literal) for the pattern, so the backslashes do not need to be doubled.
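For a pattern that is fixed at compile time, the fail-fast variant mentioned above might look like the following sketch. regexp.MustCompile panics at startup if the pattern is invalid; the names rule and toHR are hypothetical and not part of the original answer.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule is compiled once at program start. MustCompile panics immediately
// if the pattern is invalid, instead of returning a nil *Regexp.
var rule = regexp.MustCompile(`\* \* \*`)

// toHR replaces every "* * *" in s with "<hr>".
func toHR(s string) string {
	return rule.ReplaceAllString(s, "<hr>")
}

func main() {
	fmt.Println(toHR("above * * * below"))
}
```

MustCompile suits hard-coded patterns; for patterns that come from user input, regexp.Compile with an explicit error check is usually the safer choice.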

Answer 3

Score: 4

Adding to @VonC's answer: regular expressions aren't always the answer, and they are generally slower than the functions in the strings package.

For a complex expression, sure, regexp is awesome. However, if you just want to match a fixed string and replace it, then strings.Replacer is the way to go:

var asterisksReplacer = strings.NewReplacer(`* * *`, `<hr>`)

func main() {
    fmt.Println(asterisksReplacer.Replace(`xxx * * * yyy *-*-* zzz* * *`))
}
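To see that the two approaches agree for this fixed string, they can be run side by side. This is only an illustrative sketch: withReplacer, withRegexp, and asterisksPattern are hypothetical names, and the sketch makes no measurement of speed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Plain string replacement: no escaping needed, "*" is an ordinary character.
	asterisksReplacer = strings.NewReplacer("* * *", "<hr>")
	// Regexp replacement: each "*" must be escaped.
	asterisksPattern = regexp.MustCompile(`\* \* \*`)
)

// withReplacer replaces "* * *" using strings.Replacer.
func withReplacer(s string) string {
	return asterisksReplacer.Replace(s)
}

// withRegexp replaces "* * *" using the compiled regular expression.
func withRegexp(s string) string {
	return asterisksPattern.ReplaceAllString(s, "<hr>")
}

func main() {
	in := "xxx * * * yyy *-*-* zzz* * *"
	fmt.Println(withReplacer(in))
	fmt.Println(withRegexp(in))
}
```

Both calls leave the "*-*-*" run alone, because neither the literal string nor the pattern matches it.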

huangapple
  • Posted on February 23, 2015 at 21:55:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28675389.html