Go regexp: finding the next item after an occurrence

Question

I'm a Go beginner and I've been playing with regexes. Example:

r, _ := regexp.Compile(`\* \* \*`)
r2 := r.ReplaceAll(b, []byte("<hr>"))

(This replaces every * * * with <hr>.)

One thing I have no idea how to do is find the next item after an occurrence. In JavaScript/jQuery I used to do this:

$("#input-content p:has(br)").next('p').doStuff()

(Find the next p tag after a p tag that has a br tag inside).

What's the simplest way to accomplish the same in Go? Say, finding the next line after * * * ?

> * * *
>
> Match this line

Answer 1

Score: 1

You need a capturing group to grab the contents of that line:

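package main

import (
    "fmt"
    "regexp"
)

func main() {
    str := `
* * *

Match this line
`
    // Group 1 captures the line two newlines after the separator.
    r := regexp.MustCompile(`\* \* \*\n.*\n(.*)`)

    // FindStringSubmatch returns nil when nothing matches, so check before indexing.
    if m := r.FindStringSubmatch(str); m != nil {
        fmt.Println(m[1])
    }
}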

Output:

Match this line

Explanation:

\* \* \*    Matches the first line containing the asterisks.
\n          A newline.
.*          Second line. Can be anything (Likely the line is simply empty)
\n          A newline
(           Start of capturing group
.*          The content of interest
)           End of capturing group

In the comments you asked how to replace the third line with <hr/>. In that case I would use two capturing groups: one for the part before the line of interest and one for the line itself. In the replacement pattern, $1 then inserts the value of the first capturing group into the result, so only the second group is replaced.

Example:

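package main

import (
    "fmt"
    "regexp"
)

func main() {
    // No trailing space after "* * *": the pattern expects the newline right after it.
    str := `
* * *

Match this line
`
    // Group 1 is everything before the line of interest, group 2 is the line itself.
    r := regexp.MustCompile(`(\* \* \*\n.*\n)(.*)`)

    // $1 keeps group 1; group 2 is dropped and replaced by <hr/>.
    str = r.ReplaceAllString(str, "$1<hr/>")

    fmt.Println(str)
}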

huangapple
  • Posted on February 24, 2015, 21:49:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28697533.html