Text processing in Go - how to convert string to byte?

Question

I'm writing a small program to number paragraphs:

  1. Put a paragraph number in front of each paragraph, in the form [1]..., [2]....
  2. Exclude the article title.

Here is my program:

package main

import (
	"fmt"
	"io/ioutil"
)

var s_end = [3]string{".", "!", "?"}

func main() {
	b, err := ioutil.ReadFile("i_have_a_dream.txt")
	if err != nil {
		panic(err)
	}

	p_num, s_num := 1, 1

	for _, char := range b {
		fmt.Printf("[%s]", p_num)
		p_num += 1
		if char == byte("\n") {
			fmt.Printf("\n[%s]", p_num)
			p_num += 1
		} else {
			fmt.Printf(char)
		}
	}
}

http://play.golang.org/p/f4S3vQbglY

I get these errors:

prog.go:21: cannot convert "\n" to type byte
prog.go:21: cannot convert "\n" (type string) to type byte
prog.go:21: invalid operation: char == "\n" (mismatched types byte and string)
prog.go:25: cannot use char (type byte) as type string in argument to fmt.Printf
[process exited with non-zero status]

How do I convert a string to a byte?

What is the general practice for processing text: read it all in and then parse it byte by byte, or line by line?

Update

I solved the problem by converting the byte buffer to a string and replacing the line breaks using a regular expression. (Thanks to @Tomasz Kłak for the regexp help.)

I put the code here for reference.

package main

import (
	"fmt"
	"io/ioutil"
	"regexp"
)


func main() {
	b, err := ioutil.ReadFile("i_have_a_dream.txt")
	if err != nil {
		panic(err)
	}

	s := string(b)
	r := regexp.MustCompile("(\r\n)+")
	counter := 1

	repl := func(match string) string {
		p_num := counter
		counter++
		return fmt.Sprintf("%s [%d] ", match, p_num)
	}

	fmt.Println(r.ReplaceAllStringFunc(s, repl))
}

Answer 1

Score: 9

Using "\n" makes it a string (a sequence of bytes), which cannot be converted to a single byte. Use '\n' instead; it is a rune literal, a single character that can be compared with a byte.
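
As a minimal sketch of that fix applied to the loop from the question: the helper name numberParagraphs is hypothetical, and the sketch assumes every '\n' byte starts a new paragraph. It does not skip the article title, and a trailing newline in the input would produce one extra number.

```go
package main

import "fmt"

// numberParagraphs is a hypothetical helper: it prefixes the text with "[1] "
// and writes the next number after every newline byte.
func numberParagraphs(b []byte) string {
	out := []byte("[1] ")
	n := 1
	for _, c := range b {
		out = append(out, c)
		// '\n' is a rune literal, an untyped constant that fits in a byte,
		// so it can be compared with c directly. "\n" would not compile here.
		if c == '\n' {
			n++
			out = append(out, fmt.Sprintf("[%d] ", n)...)
		}
	}
	return string(out)
}

func main() {
	fmt.Println(numberParagraphs([]byte("first\nsecond")))
}
```

To print a single byte as a character, fmt.Printf("%c", c) works; passing the byte itself as the format string, as in the question, does not compile.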

Answer 2

Score: 0

A string cannot be converted into a byte in a meaningful way. Use one of the following approaches:

  • If you have a string literal like "a", consider using a rune literal like 'a' which can be converted into a byte.
  • If you want to take a byte out of a string, use an index expression like myString[42].
  • If you want to interpret the content of a string as a (decimal) number, use strconv.Atoi() or strconv.ParseInt().

Please note that it is customary in Go to write programs that handle Unicode text correctly. Explaining how to do this is beyond the scope of this answer, but there are tutorials that explain what to pay attention to.
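
One thing such tutorials usually cover is the difference between bytes and runes. The following is only a small sketch of that point, not a full treatment; the helper countRunes is hypothetical and the sample string is chosen for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// countRunes is a hypothetical helper that counts Unicode code points.
// Ranging over a string decodes its UTF-8 bytes one rune at a time.
func countRunes(s string) int {
	n := 0
	for range s {
		n++
	}
	return n
}

func main() {
	s := "Kłak" // 'ł' takes two bytes in UTF-8

	fmt.Println(len(s))                    // 5 (bytes)
	fmt.Println(countRunes(s))             // 4 (code points)
	fmt.Println(utf8.RuneCountInString(s)) // 4, same count from the standard library

	// The index is the byte offset where each rune starts, so it skips
	// a value after a multi-byte character.
	for i, r := range s {
		fmt.Printf("byte offset %d: %c\n", i, r)
	}
}
```

Ranging over a []byte, as the question's program does, visits individual bytes instead, so a multi-byte character would be split across iterations.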

huangapple
  • Posted on January 11, 2015 at 16:34:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/27885319.html