Does Go have a case-insensitive string contains() function?

Question

I would like to determine whether stringB is a case-insensitive substring of stringA. Looking through Go's strings package, the closest I can get is strings.Contains(strings.ToLower(stringA), strings.ToLower(stringB)). Is there a less wordy alternative that I'm not seeing?

Answer 1

Score: 14

If it's just the wordiness that you dislike, you could try making your code formatting cleaner, e.g.:

strings.Contains(
    strings.ToLower(stringA),
    strings.ToLower(stringB),
)

Or hiding it in a function in your own utils (or whatever) package:

package utils

import "strings"

func ContainsI(a string, b string) bool {
    return strings.Contains(
        strings.ToLower(a),
        strings.ToLower(b),
    )
}
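
As a quick check of how such a helper behaves, here is a minimal sketch. The name containsI is an assumption (lowercased only so the example fits in a single main package); it simply mirrors the ContainsI helper above. Note the edge case that an empty substring always matches, because strings.Contains reports true for "".

```go
package main

import (
	"fmt"
	"strings"
)

// containsI mirrors the ContainsI helper above; the lowercase name is an
// assumption made so the sketch fits in one main package.
func containsI(a, b string) bool {
	return strings.Contains(strings.ToLower(a), strings.ToLower(b))
}

func main() {
	fmt.Println(containsI("Hello, Gopher", "GOPHER")) // true
	fmt.Println(containsI("Hello, Gopher", "python")) // false
	fmt.Println(containsI("Hello, Gopher", ""))       // true: every string contains ""
}
```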

Answer 2

Score: 2

Another option:

package main
import "regexp"

func main() {
   b := regexp.MustCompile("(?i)we").MatchString("West East")
   println(b)
}

https://golang.org/pkg/regexp/syntax
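
One caveat with the regexp approach: if the substring comes from user input, characters such as "." or "+" would be interpreted as regular expression syntax. A minimal sketch of one way to guard against that, using regexp.QuoteMeta from the standard library; the helper name containsRegexp is hypothetical and not part of the regexp package.

```go
package main

import (
	"fmt"
	"regexp"
)

// containsRegexp is a hypothetical helper name, not part of the regexp package.
func containsRegexp(s, substr string) bool {
	// QuoteMeta escapes characters such as "." and "+" so substr matches literally.
	re := regexp.MustCompile("(?i)" + regexp.QuoteMeta(substr))
	return re.MatchString(s)
}

func main() {
	fmt.Println(containsRegexp("West East", "we"))   // true
	fmt.Println(containsRegexp("Price: 1+1", "1+1")) // true
	fmt.Println(containsRegexp("a-b", "A.B"))        // false: "." is literal here
}
```

Compiling the expression on every call is costly, so this form is probably only reasonable when the substring changes between calls.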

Answer 3

Score: 0

I don't see one in the standard packages. How about this?

package main

import (
	"fmt"
	"strings"
)

// strcasestr reports whether a occurs in b, ignoring case.
// It compares byte windows of len(a), so it assumes a match has the same
// byte length as a (true for ASCII text).
func strcasestr(a, b string) bool {
	d := len(a)
	if d == 0 {
		return true
	}
	// Both cases of a's first byte, used to find candidate start positions.
	xx := strings.ToLower(a[0:1]) + strings.ToUpper(a[0:1])
	for i := 0; i <= len(b)-d; i++ {
		// Search only the unscanned tail of b so the loop always advances.
		j := strings.IndexAny(b[i:], xx)
		if j == -1 {
			break
		}
		i += j
		if i+d > len(b) {
			break
		}
		if d == 1 {
			return true
		}
		if strings.EqualFold(a[1:], b[i+1:i+d]) {
			return true
		}
	}
	return false
}

func main() {
	examples := []struct {
		a, b string
	}{
		{"APP", "apple pie"},
		{"Read", "banana bread"},
		{"ISP", "cherry crisp"},
		{"ago", "dragonfruit tart"},
		{"INC", "elderberry wine"},
		{"M", "Feijoa jam"},
	}
	for i, e := range examples {
		fmt.Println(i, ":", e.a, " in ", e.b, "? ", strcasestr(e.a, e.b))
	}
}

Answer 4

Score: 0

Expanding on Zombo's answer with a benchmark comparing the use of strings.Contains(strings.ToLower(s), strings.ToLower(substr)) to the use of regexp.MustCompile("(?i)" + regexp.QuoteMeta(substr)).MatchString(s).

Code

import (
	"regexp"
	"strings"
	"testing"
)

const checkStringLen38 = "Hello RiCHard McCliNTock. How are you?"
const checkStringLen3091 = `What is Lorem Ipsum?

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
Why do we use it?

It is a long established fact that a reader will be distracted by the readable content of a page when looking at its layout. The point of using Lorem Ipsum is that it has a more-or-less normal distribution of letters, as opposed to using 'Content here, content here', making it look like readable English. Many desktop publishing packages and web page editors now use Lorem Ipsum as their default model text, and a search for 'lorem ipsum' will uncover many web sites still in their infancy. Various versions have evolved over the years, sometimes by accident, sometimes on purpose (injected humour and the like).

Where does it come from?

Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. RiCHard McCliNTock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.

The standard chunk of Lorem Ipsum used since the 1500s is reproduced below for those interested. Sections 1.10.32 and 1.10.33 from "de Finibus Bonorum et Malorum" by Cicero are also reproduced in their exact original form, accompanied by English versions from the 1914 translation by H. Rackham.
Where can I get some?

There are many variations of passages of Lorem Ipsum available, but the majority have suffered alteration in some form, by injected humour, or randomised words which don't look even slightly believable. If you are going to use a passage of Lorem Ipsum, you need to be sure there isn't anything embarrassing hidden in the middle of text. All the Lorem Ipsum generators on the Internet tend to repeat predefined chunks as necessary, making this the first true generator on the Internet. It uses a dictionary of over 200 Latin words, combined with a handful of model sentence structures, to generate Lorem Ipsum which looks reasonable. The generated Lorem Ipsum is therefore always free from repetition, injected humour, or non-characteristic words etc.`
const searchQuery = "richard mcclintock"

func BenchmarkContainsLowerLowerShort(b *testing.B) {
	for n := 0; n < b.N; n++ {
		strings.Contains(strings.ToLower(checkStringLen38), strings.ToLower(searchQuery))
	}
}

func BenchmarkContainsLowerLowerLong(b *testing.B) {
	for n := 0; n < b.N; n++ {
		strings.Contains(strings.ToLower(checkStringLen3091), strings.ToLower(searchQuery))
	}
}

func BenchmarkRegexpShort(b *testing.B) {
	for n := 0; n < b.N; n++ {
		regexp.MustCompile("(?i)" + regexp.QuoteMeta(searchQuery)).MatchString(checkStringLen38)
	}
}

func BenchmarkRegexpLong(b *testing.B) {
	for n := 0; n < b.N; n++ {
		regexp.MustCompile("(?i)" + regexp.QuoteMeta(searchQuery)).MatchString(checkStringLen3091)
	}
}

func BenchmarkRegexpShortPrebuilt(b *testing.B) {
	prebuiltRegExp := regexp.MustCompile("(?i)" + regexp.QuoteMeta(searchQuery))
	for n := 0; n < b.N; n++ {
		prebuiltRegExp.MatchString(checkStringLen38)
	}
}

func BenchmarkRegexpLongPrebuilt(b *testing.B) {
	prebuiltRegExp := regexp.MustCompile("(?i)" + regexp.QuoteMeta(searchQuery))
	for n := 0; n < b.N; n++ {
		prebuiltRegExp.MatchString(checkStringLen3091)
	}
}

Results

>go test -bench=. ./...
goos: windows
goarch: amd64
cpu: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
BenchmarkContainsLowerLowerShort-8       9147040               130.3 ns/op
BenchmarkContainsLowerLowerLong-8         158318              7594 ns/op
BenchmarkRegexpShort-8                    364604              3262 ns/op
BenchmarkRegexpLong-8                      40394             29851 ns/op
BenchmarkRegexpShortPrebuilt-8           3741936               328.8 ns/op
BenchmarkRegexpLongPrebuilt-8              44394             27264 ns/op

Interpretation of Results

When only short strings are searched, regexp.MustCompile("(?i)" + regexp.QuoteMeta(substr)).MatchString(s) benefits greatly (about one order of magnitude: 3262 ns/op vs. 328.8 ns/op) from building the *regexp.Regexp only once; for the long input the gain is small (29851 ns/op vs. 27264 ns/op). Even with a prebuilt expression, it still takes roughly 2.5 times (short input) to 3.6 times (long input) as long as strings.Contains(strings.ToLower(s), strings.ToLower(substr)), and we did not even check how much faster the ToLower() variant would be if the query string were assumed to be lower-cased already. Prebuilding is also only an option when substr is always the same; otherwise the regular expression has to be compiled for every search.

tl;dr

There is nothing to be gained by using regexp.MustCompile("(?i)" + regexp.QuoteMeta(substr)).MatchString(s) over strings.Contains(strings.ToLower(s), strings.ToLower(substr)).
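
The interpretation mentions, without measuring it, the case where the query string is already lower-cased. A minimal sketch of that variant, assuming the same query is reused across many searches: the query is lower-cased once and only the input is lower-cased per call. The type foldQuery and its constructor are hypothetical names, and no timings are claimed for it here.

```go
package main

import (
	"fmt"
	"strings"
)

// foldQuery is a hypothetical type: a search term that was lower-cased once.
type foldQuery string

// newFoldQuery lower-cases q a single time so later searches can skip that step.
func newFoldQuery(q string) foldQuery { return foldQuery(strings.ToLower(q)) }

// In reports whether s contains the query, ignoring case (via ToLower).
func (q foldQuery) In(s string) bool {
	return strings.Contains(strings.ToLower(s), string(q))
}

func main() {
	q := newFoldQuery("Richard McClintock")
	fmt.Println(q.In("Hello RiCHard McCliNTock. How are you?")) // true
	fmt.Println(q.In("Hello Cicero."))                          // false
}
```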

huangapple
  • Posted on June 17, 2017, 02:20:06
  • When reposting, please keep the link to this article: https://go.coder-hub.com/44595669.html