Strip all whitespace from a string

Question

What is the fastest way to strip all whitespace from an arbitrary string in Go?

I am chaining two functions from the strings package:

response = strings.Join(strings.Fields(response), "")

Does anyone have a better way to do this?
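
For reference, here is a minimal runnable sketch of what the chained call in the question does. The helper name stripFieldsJoin and the sample input are made up for illustration; only strings.Fields and strings.Join come from the question.

```go
package main

import (
	"fmt"
	"strings"
)

// stripFieldsJoin (hypothetical name) splits s around runs of whitespace
// with strings.Fields and glues the remaining pieces back together
// with an empty separator.
func stripFieldsJoin(s string) string {
	return strings.Join(strings.Fields(s), "")
}

func main() {
	// Spaces, a tab, and a newline are all treated as separators.
	fmt.Printf("%q\n", stripFieldsJoin(" a b\tc\nd ")) // prints "abcd"
}
```

strings.Fields builds a slice holding every non-whitespace run before strings.Join copies them into the result, so the approach is correct but does more work than a single pass would.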

Answer 1

Score: 79

Here are some benchmarks of a few different methods for stripping all whitespace characters from a string (source data):

<pre>
BenchmarkSpaceMap-8              2000   1100084 ns/op    221187 B/op    2 allocs/op
BenchmarkSpaceFieldsJoin-8       1000   2235073 ns/op   2299520 B/op   20 allocs/op
BenchmarkSpaceStringsBuilder-8   2000    932298 ns/op    122880 B/op    1 allocs/op
</pre>

  • SpaceMap: uses strings.Map; gradually increases the amount of allocated space as more non-whitespace characters are encountered
  • SpaceFieldsJoin: uses strings.Fields and strings.Join; generates a lot of intermediate data
  • SpaceStringsBuilder: uses strings.Builder; performs a single allocation, but may grossly overallocate if the source string is mainly whitespace

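package main_test

import (
	"strings"
	"testing"
	"unicode"
)

// data is a placeholder for the benchmark input. The original input is the
// linked source data and is not reproduced here; load it before benchmarking.
var data string

// SpaceMap removes whitespace with strings.Map. Returning a negative
// value from the mapping function drops the rune from the result.
func SpaceMap(str string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsSpace(r) {
			return -1
		}
		return r
	}, str)
}

// SpaceFieldsJoin splits on whitespace and joins the pieces back together.
func SpaceFieldsJoin(str string) string {
	return strings.Join(strings.Fields(str), "")
}

// SpaceStringsBuilder copies non-whitespace runes into a strings.Builder
// that is pre-sized to the length of the input.
func SpaceStringsBuilder(str string) string {
	var b strings.Builder
	b.Grow(len(str))
	for _, ch := range str {
		if !unicode.IsSpace(ch) {
			b.WriteRune(ch)
		}
	}
	return b.String()
}

func BenchmarkSpaceMap(b *testing.B) {
	for n := 0; n < b.N; n++ {
		SpaceMap(data)
	}
}

func BenchmarkSpaceFieldsJoin(b *testing.B) {
	for n := 0; n < b.N; n++ {
		SpaceFieldsJoin(data)
	}
}

func BenchmarkSpaceStringsBuilder(b *testing.B) {
	for n := 0; n < b.N; n++ {
		SpaceStringsBuilder(data)
	}
}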
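
The over-allocation noted for SpaceStringsBuilder could in principle be avoided by counting the bytes to keep before allocating. The sketch below is an assumption of mine and not part of the benchmark above: the name stripExact is hypothetical, and it has not been measured, so the extra scan may well cost more than the memory it saves.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// stripExact (hypothetical) removes all Unicode whitespace from s.
// It scans s twice: once to count the bytes that survive, and once
// to copy them into a builder sized to exactly that count.
func stripExact(s string) string {
	// First pass: count the encoded size of every rune that is kept.
	n := 0
	for _, r := range s {
		if !unicode.IsSpace(r) {
			n += utf8.RuneLen(r)
		}
	}

	// Second pass: copy the kept runes into a builder of exactly n bytes.
	var b strings.Builder
	b.Grow(n)
	for _, r := range s {
		if !unicode.IsSpace(r) {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", stripExact("  a  b  "))
	fmt.Printf("%q\n", stripExact("héllo wörld"))
}
```

Whether this is worth it depends on the input: for strings that are mostly non-whitespace, the single Grow(len(str)) call in SpaceStringsBuilder wastes little and avoids the second scan.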

Answer 2

Score: 77

I found the simplest way is to use strings.ReplaceAll, like so:

randomString := "  hello      this is a test"
fmt.Println(strings.ReplaceAll(randomString, " ", ""))

> hellothisisatest

Note: This does NOT remove all types of whitespace. It removes only the plain space character, so tabs, newlines, and other Unicode whitespace are left in place, and each call can remove only a single character type.

Answer 3

Score: 3

This kind of function can be found on rosettacode.org:

func stripChars(str, chr string) string {
    return strings.Map(func(r rune) rune {
        if strings.IndexRune(chr, r) < 0 {
            return r
        }
        return -1
    }, str)
}

So simply passing " " as chr should be enough to remove the spaces.

Beware that Unicode defines other kinds of whitespace (line breaks, NBSP, ...), and you may want to get rid of those too, especially if you are working with external data you don't control.

That can be done like this:

func stripSpaces(str string) string {
    return strings.Map(func(r rune) rune {
        if unicode.IsSpace(r) {
            // if the character is a space, drop it
            return -1
        }
        // else keep it in the string
        return r
    }, str)
}

Then simply apply it to your string. I hope it works; I didn't test it.
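
Since the answer says the functions were not tested, here is a small runnable check of both. The two functions are copied from the answer; the sample inputs in main are my own.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripChars removes from str every rune that appears in chr.
func stripChars(str, chr string) string {
	return strings.Map(func(r rune) rune {
		if strings.IndexRune(chr, r) < 0 {
			return r
		}
		return -1
	}, str)
}

// stripSpaces removes every rune that unicode.IsSpace reports as whitespace.
func stripSpaces(str string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsSpace(r) {
			return -1
		}
		return r
	}, str)
}

func main() {
	// chr may list several characters to drop, here space and tab.
	fmt.Printf("%q\n", stripChars("a b\tc", " \t")) // prints "abc"

	// unicode.IsSpace also covers the non-breaking space (U+00A0) and newlines.
	fmt.Printf("%q\n", stripSpaces("a\u00a0b\nc d")) // prints "abcd"
}
```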

Answer 4

Score: -1

I assume this was added since the question was asked years ago:

strings.TrimSpace(moo)

"TrimSpace returns a slice of the string s, with all leading and trailing white space removed, as defined by Unicode."

See https://pkg.go.dev/strings#TrimSpace

huangapple
  • Published on August 19, 2015 at 04:14:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/32081808.html