How to write a simple regex in golang?

Question

I am trying to write a regexp that, for a string beginning with a dot, returns the substring between that dot and the first space. I am new to regular expressions, so I tried something like this, and it doesn't work at all:

package main

import "fmt"
import "regexp"

func main() {
    re := regexp.MustCompile("\\.* ")
    fmt.Printf(re.FindString(".d 1000=11,12")) // Must return d
    fmt.Printf(re.FindString("e 2000=11"))     // Must return nothing or ""
    fmt.Printf(re.FindString(".e2000=11"))     // Must return nothing or ""
}

This code prints only spaces in Go. What am I doing wrong?

Answer 1

Score: 16

While * is the wildcard in glob matching, it is not the wildcard in regex. In regex, . is the wildcard and * means zero or more repetitions of whatever precedes it, so \.* matches zero or more literal dots. You probably want:

re := regexp.MustCompile("\\..* ")
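
To see the difference concretely, here is a minimal sketch that runs both patterns over the three inputs from the question. The variable names dotsThenSpace and dotAnyThenSpace are made up for illustration; only regexp.MustCompile and FindString come from the standard library.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Pattern from the question: zero or more literal dots, then a space.
	dotsThenSpace = regexp.MustCompile("\\.* ")
	// Pattern from this answer: one literal dot, any characters, then a space.
	dotAnyThenSpace = regexp.MustCompile("\\..* ")
)

func main() {
	for _, s := range []string{".d 1000=11,12", "e 2000=11", ".e2000=11"} {
		// %q quotes each result so that spaces and empty strings are visible.
		fmt.Printf("%q: original=%q corrected=%q\n",
			s, dotsThenSpace.FindString(s), dotAnyThenSpace.FindString(s))
	}
}
```

For the first input the original pattern finds only the space, while the corrected one finds ".d " (dot and space included). Printing with %q instead of passing the result straight to fmt.Printf makes such whitespace-only matches easy to spot.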

But you might notice that it also returns the dot and the space. You can fix this with FindStringSubmatch and a capture group, and you can use backticks (a raw string literal) so that you don't have to double-escape things:

re := regexp.MustCompile(`\.(.*) `)
match := re.FindStringSubmatch(".d 1000=11,12")
if len(match) != 0 {
    fmt.Printf("1. %s\n", match[1])
}
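
As a complete program, the capture-group approach might look like the sketch below. The helper tokenAfterDot and the variable dotToken are hypothetical names, not part of the regexp package; the sketch relies only on FindStringSubmatch returning nil when nothing matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// dotToken is the pattern from the answer: a literal dot, a captured run of
// characters, then a space.
var dotToken = regexp.MustCompile(`\.(.*) `)

// tokenAfterDot is a hypothetical helper. It returns the captured text and
// reports whether the pattern matched at all.
func tokenAfterDot(s string) (string, bool) {
	match := dotToken.FindStringSubmatch(s)
	if match == nil {
		return "", false
	}
	// match[0] is the whole match, match[1] is the first capture group.
	return match[1], true
}

func main() {
	for _, s := range []string{".d 1000=11,12", "e 2000=11", ".e2000=11"} {
		token, ok := tokenAfterDot(s)
		fmt.Printf("%q -> %q (matched: %t)\n", s, token, ok)
	}
}
```

Returning a separate boolean keeps "no match" distinct from "matched an empty token", which a bare string return could not express.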

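Though I would prefer \S* (matches non-whitespace characters) instead of .* for this match: .* is greedy and runs to the last space in the string, while \S* cannot cross a space and so stops at the first one. In backtracking regex engines it also reduces the possible backtracking, although Go's regexp package guarantees linear-time matching either way: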

re := regexp.MustCompile(`\.(\S*) `)
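
The two patterns only differ once the input contains more than one space. The sketch below uses a made-up input with a second space (it is not from the question), and firstGroup is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	greedyAny = regexp.MustCompile(`\.(.*) `)  // greedy: any characters
	nonSpace  = regexp.MustCompile(`\.(\S*) `) // non-whitespace characters only
)

// firstGroup is a hypothetical helper that returns the first capture group,
// or "" when the pattern does not match.
func firstGroup(re *regexp.Regexp, s string) string {
	if m := re.FindStringSubmatch(s); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	s := ".d 1000=11, 12" // assumed input containing two spaces
	fmt.Printf("greedy .*: %q\n", firstGroup(greedyAny, s))
	fmt.Printf("\\S*:       %q\n", firstGroup(nonSpace, s))
}
```

With this input the greedy version captures "d 1000=11," because .* extends to the last space, while the \S* version captures just "d".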

Answer 2

Score: 0

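In a double-quoted Go string, \\ is just how you write a single backslash, so "\\.* " gives the regex \.* followed by a space (zero or more literal dots, then a space); inside backticks, \\ would instead match a literal backslash. To require the dot at the start of the string and stop at the first space, write `^\..*? ` (note the trailing space) instead: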

  • ^ - matches the beginning of the string
  • \. - an escaped dot (so together with the ^ above, it means you expect a literal dot as the first character)
  • .*? - any character (dot), any number of them (asterisk), non-greedy (question mark), up to the space that follows

Non-greedy means that it will stop at the first space, not at the last one.
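
A minimal sketch of the anchored, non-greedy pattern with a capture group added so that the dot and the space are left out of the result. The names anchoredLazy and leadingDotToken are hypothetical, and two of the inputs are invented to show the effect of ^ and of the lazy quantifier.

```go
package main

import (
	"fmt"
	"regexp"
)

// anchoredLazy requires the dot at the very start of the string and captures
// as few characters as possible before the first space.
var anchoredLazy = regexp.MustCompile(`^\.(.*?) `)

// leadingDotToken is a hypothetical helper that returns the captured text,
// or "" when the string does not match.
func leadingDotToken(s string) string {
	if m := anchoredLazy.FindStringSubmatch(s); m != nil {
		return m[1]
	}
	return ""
}

func main() {
	inputs := []string{
		".d 1000=11,12",  // from the question
		".d 1000=11, 12", // assumed input: a second space later on
		"a.b c",          // assumed input: the dot is not the first character
		".e2000=11",      // from the question: no space at all
	}
	for _, s := range inputs {
		fmt.Printf("%q -> %q\n", s, leadingDotToken(s))
	}
}
```

The first two inputs yield "d"; the last two yield "" because the anchor rejects a dot that is not at the start and the pattern needs a space to end the match.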

huangapple
  • Posted on March 9, 2014, 20:35:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/22282229.html