Counting the occurrences of one or more substrings in a string

Question

I know that to count the occurrences of a single substring I can use strings.Count(<string>, <substring>). What if I want to count the occurrences of substring1 OR substring2? Is there a more elegant way than adding another line with strings.Count()?

Answer 1

Score: 17

Use a regular expression (https://golang.org/pkg/regexp/):

https://play.golang.org/p/xMsHIYKtkQ

// Match either "A" or "B".
aORb := regexp.MustCompile("A|B")

// -1 asks for every match rather than only the first n.
matches := aORb.FindAllStringIndex("A B C B A", -1)
fmt.Println(len(matches)) // prints 4

Answer 2

Score: 2

Another way to do substring matching is with the suffixarray package. Here is an example that counts the matches of a regular expression against the index; to match several patterns, use an alternation such as "A|B" in the expression:

package main

import (
	"fmt"
	"index/suffixarray"
	"regexp"
)

func main() {
	r := regexp.MustCompile("an")
	index := suffixarray.New([]byte("banana"))
	results := index.FindAllIndex(r, -1)
	fmt.Println(len(results))
}

You can also match a single literal substring with the index's Lookup method.

Answer 3

Score: 0

If you want to count the matches in a large string without allocating a slice of all the match indices just to take its length and throw it away, you can call Regexp.FindStringIndex in a loop, matching against successive suffixes of the string:

func countMatches(s string, re *regexp.Regexp) int {
	total := 0
	for start := 0; start < len(s); {
		remaining := s[start:] // slicing the string is cheap
		loc := re.FindStringIndex(remaining)
		if loc == nil {
			break
		}
		// loc[0] is the start index of the match,
		// loc[1] is the end index (exclusive)
		if loc[1] == 0 {
			// empty match at the very start: step one byte,
			// otherwise the loop would never advance
			start++
		} else {
			start += loc[1]
		}
		total++
	}
	return total
}

func main() {
	s := "abracadabra"
	fmt.Println(countMatches(s, regexp.MustCompile(`a|b`))) // prints 7
}

runnable example at Go Playground

huangapple
  • Posted on March 11, 2017, 03:16:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/42726108.html