strings.Split in Go

Question

The file names.txt consists of many names in the form of:

"KELLEE","JOSLYN","JASON","INGER","INDIRA","GLINDA","GLENNIS"

Does anyone know how to split the string so that it is individual names separated by commas?

KELLEE,JOSLYN,JASON,INGER,INDIRA,GLINDA,GLENNIS

The following code splits on commas but leaves the quotes around each name. What is the escape character for splitting out the "? Can it be done in one Split call, splitting on "," and leaving a comma as the separator?

package main

import "fmt"
import "io/ioutil"
import "strings"

func main() {
	fData, err := ioutil.ReadFile("names.txt") // read in the external file
	if err != nil {
		fmt.Println("Err is ", err) // print any error
	}
	strbuffer := string(fData) // convert read in file to a string

	arr := strings.Split(strbuffer, ",")

	fmt.Println(arr)
}

By the way, this is part of Project Euler problem #22: http://projecteuler.net/problem=22

Answer 1

Score: 19

Jeremy's answer is basically correct and does exactly what you asked for. But the format of your names.txt file is actually a well-known one called CSV (comma-separated values). Luckily, Go comes with an encoding/csv package (part of the standard library) for decoding and encoding this format easily. Compared with your and Jeremy's solution, this package also gives exact error messages if the format is invalid, supports multi-line records, and properly unquotes quoted strings.

The basic usage looks like this:

package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"os"
)

func main() {
	file, err := os.Open("names.txt")
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	defer file.Close()
	reader := csv.NewReader(file)
	for {
		record, err := reader.Read()
		if err == io.EOF {
			break
		} else if err != nil {
			fmt.Println("Error:", err)
			return
		}

		fmt.Println(record) // record has the type []string
	}
}

There is also a ReadAll method that might make your program even shorter, assuming that the whole file fits into memory.

Update: dystroy has just pointed out that your file has only one line anyway. The CSV reader works well for that too, but the following, less general solution should also be sufficient:

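var name string // receives each unquoted name
for {
	// %q reads one double-quoted string; the "," in the format consumes the separator
	if n, _ := fmt.Fscanf(file, "%q,", &name); n != 1 {
		break
	}
	fmt.Println("name:", name)
}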

Answer 2

Score: 9

Split doesn't remove characters from the substrings. Your split is fine; you just need to process the slice afterwards with strings.Trim(val, "\"").

for i, val := range arr {
	arr[i] = strings.Trim(val, "\"")
}

Now each element of arr will have its leading and trailing " removed.

Posted by huangapple on July 13, 2012, 09:19:13
Link: https://go.coder-hub.com/11462879.html