Escape characters get lost when reading a buffer with a CSV reader

Question

I have a string "\"{1,2}\"\n", which contains the escape sequences \" and \n.

I tried to convert it into an array of bytes and to read that array back into a string using a CSV reader. However, I noticed that these escape characters are eliminated from the read-back string. Here is the code:

package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

func main() {
	var buf bytes.Buffer
	data := "\"{1,2}\"\n"
	buf.WriteString(data)
	fmt.Println(buf) // {[34 123 49 44 50 125 34 10] 0 0}

	// The same bytes as above: '"', '{', '1', ',', '2', '}', '"', '\n'.
	inputArr := []byte{34, 123, 49, 44, 50, 125, 34, 10}
	var newbuf bytes.Buffer
	newbuf.Write(inputArr)
	csvReader := csv.NewReader(&newbuf)
	news, err := csvReader.Read()
	if err != nil {
		fmt.Println(err)
		return // news is nil on error, so news[0] would panic
	}
	fmt.Println(news[0]) // {1,2}
}

How can I keep these escape characters in the read-back string, i.e. so that news[0] == "\"{1,2}\"\n"?

Answer 1

Score: 4

data is initialized with an interpreted string literal. The sequence \" denotes a single double quote character ("), and \n denotes a single newline character. The backslashes appear only in the source code, not in the string value: the compiler "eliminates" them when it interprets the string literal.
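
To make this concrete, here is a minimal sketch that inspects the value of the interpreted literal from the question. The helper name containsBackslash is made up for this illustration; it only wraps strings.ContainsRune from the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

// containsBackslash is a hypothetical helper for this illustration:
// it reports whether the string value holds a backslash character.
func containsBackslash(s string) bool {
	return strings.ContainsRune(s, '\\')
}

func main() {
	data := "\"{1,2}\"\n" // the interpreted literal from the question

	// The value holds 8 bytes: '"', '{', '1', ',', '2', '}', '"', '\n'.
	fmt.Println([]byte(data), len(data))

	// Reports whether a backslash (byte 92) is part of the value.
	fmt.Println(containsBackslash(data))

	// %q re-escapes the value for display; it does not change the string.
	fmt.Printf("%q\n", data)
}
```

If the goal is only to see the escape sequences when printing, the %q verb (or strconv.Quote) may already be enough, since the string value itself never contained them.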

If you want the backslash chars to be part of the string's value, use a raw string literal:

data := `\"{1,2}\"\n`
fmt.Println(data)

This will output:

\"{1,2}\"\n

Alternatively, keep using an interpreted string literal and write a \\ sequence for each single backslash in the string value:

data := "\\\"{1,2}\\\"\\n"
fmt.Println(data)

This outputs the same.
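
As a rough side-by-side check of the three spellings discussed so far, the sketch below returns them from one function and compares them. The function name literalForms is an assumption made for this example only.

```go
package main

import "fmt"

// literalForms is a hypothetical helper that returns the three
// literals from this answer so they can be compared.
func literalForms() (interpreted, raw, escaped string) {
	interpreted = "\"{1,2}\"\n"  // 8 bytes: two quotes, a newline, no backslashes
	raw = `\"{1,2}\"\n`          // 11 bytes: the backslashes are kept literally
	escaped = "\\\"{1,2}\\\"\\n" // interpreted literal with escaped backslashes
	return interpreted, raw, escaped
}

func main() {
	interpreted, raw, escaped := literalForms()
	fmt.Println(len(interpreted), len(raw), len(escaped))
	fmt.Println(raw == escaped)     // the raw and escaped forms are the same value
	fmt.Println(raw == interpreted) // the original literal is a different value
}
```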

If your goal is to have double quotes in the CSV column value, you have to use CSV escaping (quoted fields): that is, you have to put the cell into double quotes and use two double quotes for each single double quote:

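// Needs the "encoding/csv", "fmt" and "strings" imports.
// The field is wrapped in quotes, and each inner " is doubled.
data := `"""{1,2}"""
`

csvReader := csv.NewReader(strings.NewReader(data))
news, err := csvReader.Read()

if err != nil {
	fmt.Println(err)
	return // news is nil on error
}
fmt.Println(news[0])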

This will output:

"{1,2}"

This is documented in the package doc of encoding/csv:

> Fields which start and stop with the quote character " are called quoted-fields. The beginning and ending quote are not part of the field.
>
> The source:
>
> normal string,"quoted-field"
>
> results in the fields
>
> {`normal string`, `quoted-field`}
>
> Within a quoted-field a quote character followed by a second quote character is considered a single quote.
>
> "the ""word"" is true","a ""quoted-field"""
>
> results in
>
> {`the "word" is true`, `a "quoted-field"`}
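
The quoting rules above can be tried out with a short sketch like the following. It parses the two sample lines from the package documentation and then lets csv.Writer do the quoting for the value from the question. The helper names parseCSV and encodeField are assumptions made for this example; the calls inside them (csv.NewReader, ReadAll, csv.NewWriter, Write, Flush, Error) are from encoding/csv.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// parseCSV (hypothetical helper) reads every record from src
// using the default reader settings.
func parseCSV(src string) ([][]string, error) {
	return csv.NewReader(strings.NewReader(src)).ReadAll()
}

// encodeField (hypothetical helper) writes one single-field record
// and returns the resulting CSV text.
func encodeField(field string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write([]string{field}); err != nil {
		return "", err
	}
	w.Flush()
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// The two sample lines quoted from the package documentation.
	docSamples := `normal string,"quoted-field"
"the ""word"" is true","a ""quoted-field"""
`
	records, err := parseCSV(docSamples)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, rec := range records {
		fmt.Printf("%q\n", rec)
	}

	// Round trip: the writer wraps the field in quotes and doubles the
	// inner quotes, so the reader hands back the original value.
	field := "\"{1,2}\"\n"
	encoded, err := encodeField(field)
	if err != nil {
		fmt.Println(err)
		return
	}
	back, err := parseCSV(encoded)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("%q -> %q, unchanged: %t\n", encoded, back[0][0], back[0][0] == field)
}
```

Letting the writer produce the CSV text avoids hand-writing the doubled quotes, and a quoted field can also carry the trailing newline from the question.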

huangapple
  • Posted on 2021-10-18 21:39:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/69617143.html