Go regexp: finding the next item after an occurrence

Question

I'm a Go beginner and I've been playing with regexes. Example:

    r, _ := regexp.Compile(`\* \* \*`)
    r2 := r.ReplaceAll(b, []byte("<hr>"))

(Replace every * * * with <hr>.)

One thing I have no idea how to do is find the next item after an occurrence. In JavaScript/jQuery I used to do this:

    $("#input-content p:has(br)").next('p').doStuff()

(Find the next p tag after a p tag that has a br tag inside).

What's the simplest way to accomplish the same in Go? Say, finding the next line after * * * ?

> * * *
>
> Match this line

Answer 1

Score: 1

You need a capturing group to grab the contents of that line:

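    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        // The blank line between the marker and the target matters:
        // the pattern expects exactly one line in between.
        str := `
    * * *

    Match this line
    `
        r := regexp.MustCompile(`\* \* \*\n.*\n(.*)`)
        // Index 0 is the whole match, index 1 the first capturing group.
        fmt.Println(r.FindStringSubmatch(str)[1])
    }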

Output:

    Match this line

Explanation:

    \* \* \*   Matches the first line, containing the asterisks.
    \n         A newline.
    .*         The second line. Can be anything (most likely it is simply empty).
    \n         A newline.
    (          Start of the capturing group.
    .*         The content of interest.
    )          End of the capturing group.

In the comments you asked how to replace the third line with <hr/>. In that case I would use two capturing groups: one for everything before the line of interest and one for the line itself. In the replacement pattern, $1 then inserts the value of the first capturing group into the result.

Example:

    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        // Same input as before, including the blank line after the marker.
        str := `
    * * *

    Match this line
    `
        r := regexp.MustCompile(`(\* \* \*\n.*\n)(.*)`)
        // $1 keeps the marker and the blank line; the second group is dropped.
        str = r.ReplaceAllString(str, "$1<hr/>")
        fmt.Println(str)
    }

huangapple
  • Posted on February 24, 2015 at 21:49:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28697533.html