Split a string on whitespace in Go?

Question

Given an input string such as " word1 word2 word3 word4 ", what would be the best approach to split it into an array of strings in Go? Note that there can be any number of spaces or other Unicode whitespace characters between words.

In Java I would just use someString.trim().split("\\s+").

(Note: the possible duplicate https://stackoverflow.com/questions/4466091/split-string-using-regular-expression-in-go doesn't have any good-quality answers. Please provide an actual example, not just a link to the regexp or strings package reference.)

Answer 1

Score: 380

The strings package has a Fields function.

someString := "one    two   three four "
words := strings.Fields(someString)
fmt.Println(words, len(words)) // [one two three four] 4

DEMO: http://play.golang.org/p/et97S90cIH

From the docs:

> Fields splits the string s around each instance of one or more consecutive white space characters, as defined by unicode.IsSpace, returning a slice of substrings of s or an empty slice if s contains only white space.
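As a complete program, the snippet above might look like the following sketch. The helper name splitWords and the sample input are made up for illustration; the input mixes ASCII white space with two Unicode space characters to show that Fields handles both.

```go
package main

import (
	"fmt"
	"strings"
)

// splitWords is a hypothetical helper name; it only wraps strings.Fields.
func splitWords(s string) []string {
	return strings.Fields(s)
}

func main() {
	// Separators used here: space, tab, newline, no-break space (U+00A0)
	// and ideographic space (U+3000).
	input := " word1\tword2\n\u00a0word3\u3000 word4 "
	words := splitWords(input)
	fmt.Printf("%q %d\n", words, len(words)) // ["word1" "word2" "word3" "word4"] 4

	// A string that contains only white space yields an empty slice.
	fmt.Println(len(splitWords(" \t\n "))) // 0
}
```

Leading and trailing white space is dropped automatically, so no separate trim step is needed.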

Answer 2

Score: 12

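If you're using tip (Go's development branch when this was written; the method has been in releases since Go 1.1), there is regexp.Split: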

func (re *Regexp) Split(s string, n int) []string

Split slices s into substrings separated by the expression and returns
a slice of the substrings between those expression matches.

The slice returned by this method consists of all the substrings
of s not contained in the slice returned by FindAllString. When called
on an expression that contains no metacharacters, it is equivalent to strings.SplitN.

Example:

s := regexp.MustCompile("a*").Split("abaabaccadaaae", 5)
// s: ["", "b", "b", "c", "cadaaae"]

The count determines the number of substrings to return:

n > 0: at most n substrings; the last substring will be the unsplit remainder.
n == 0: the result is nil (zero substrings)
n < 0: all substrings
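Applied to the input from the question, Split could be used roughly as in the sketch below. The names whitespaceRun and splitOnWhitespace are hypothetical. Two caveats: Split keeps the empty strings before a leading match and after a trailing match, so the input is trimmed first (loosely mirroring the Java trim().split("\\s+") idiom), and in Go's regexp syntax \s matches only ASCII white space ([\t\n\f\r ]), not every Unicode space character.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// whitespaceRun matches one or more ASCII white space characters.
var whitespaceRun = regexp.MustCompile(`\s+`)

// splitOnWhitespace is a hypothetical helper: trim first, then split on
// runs of white space so that no empty strings appear at either end.
func splitOnWhitespace(s string) []string {
	return whitespaceRun.Split(strings.TrimSpace(s), -1)
}

func main() {
	input := " word1 word2 word3 word4 "

	// Without trimming, the result starts and ends with an empty string.
	fmt.Printf("%q\n", whitespaceRun.Split(input, -1))

	// With trimming, only the four words remain.
	fmt.Printf("%q\n", splitOnWhitespace(input))
}
```

Note that for an empty or all-white-space input this helper returns a slice holding one empty string rather than an empty slice.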

Answer 3

Score: 7

I came up with the following, but that seems a bit too verbose:

import "regexp"
r := regexp.MustCompile("[^\\s]+")
r.FindAllString("  word1   word2 word3   word4  ", -1)

which will evaluate to:

[]string{"word1", "word2", "word3", "word4"}

Is there a more compact or more idiomatic expression?
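For reference, a self-contained version of this approach might look like the sketch below. The names nonSpaceRun and findWords are hypothetical, and the pattern is written as \S+ in a raw string, which is equivalent to [^\s]+. As with \s, \S is defined in terms of ASCII white space only, so a Unicode space such as U+00A0 would be treated as part of a word.

```go
package main

import (
	"fmt"
	"regexp"
)

// nonSpaceRun matches one or more characters that are not ASCII white space.
var nonSpaceRun = regexp.MustCompile(`\S+`)

// findWords is a hypothetical helper that collects every run of
// non-white-space characters; it returns nil when there is no match.
func findWords(s string) []string {
	return nonSpaceRun.FindAllString(s, -1)
}

func main() {
	words := findWords("  word1   word2 word3   word4  ")
	fmt.Printf("%q %d\n", words, len(words)) // ["word1" "word2" "word3" "word4"] 4
}
```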

Answer 4

Score: 3

You can use the Split function from the strings package:
strings.Split(someString, " ")
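One thing to be aware of with this approach: strings.Split treats every single space as a separator, so leading, trailing, or repeated spaces produce empty strings, and tabs or other white space are not split at all. The sketch below, with a made-up helper name and sample input, compares it with strings.Fields.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOnSingleSpace is a hypothetical helper showing the raw behaviour
// of strings.Split with a single-space separator.
func splitOnSingleSpace(s string) []string {
	return strings.Split(s, " ")
}

func main() {
	input := " word1  word2 "

	// Each space is a separator, so empty strings appear in the result.
	fmt.Printf("%q\n", splitOnSingleSpace(input)) // ["" "word1" "" "word2" ""]

	// strings.Fields collapses runs of white space and drops the ends.
	fmt.Printf("%q\n", strings.Fields(input)) // ["word1" "word2"]
}
```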

Posted by huangapple on December 6, 2012 at 13:53:12
Source: https://go.coder-hub.com/13737745.html