Reading a file line by line in Go

Question

I can't find a file.ReadLine function in Go.

How does one read a file line by line?

Answer 1

Score: 919

In Go 1.1 and newer, the simplest way to do this is with a bufio.Scanner. Here is a simple example that reads lines from a file:

    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
    )

    func main() {
        file, err := os.Open("/path/to/file.txt")
        if err != nil {
            log.Fatal(err)
        }
        defer file.Close()

        scanner := bufio.NewScanner(file)
        // optionally, resize scanner's capacity for lines over 64K, see next example
        for scanner.Scan() {
            fmt.Println(scanner.Text())
        }

        if err := scanner.Err(); err != nil {
            log.Fatal(err)
        }
    }

This is the cleanest way to read from a Reader line by line.

There is one caveat: Scanner stops with an error (bufio.ErrTooLong) on lines of roughly 64K or more; the default limit is bufio.MaxScanTokenSize, 65536 bytes. If you know your lines are longer than 64K, use the Buffer() method to increase the scanner's capacity:

    ...
    scanner := bufio.NewScanner(file)

    const maxCapacity int = longLineLen // your required line length
    buf := make([]byte, maxCapacity)
    scanner.Buffer(buf, maxCapacity)

    for scanner.Scan() {
    ...
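
As a rough sketch of the Buffer() approach, the program below scans an in-memory string rather than a file so that it runs on its own. The names scanLines, scanString, and maxLineLen, the 1024-byte starting buffer, and the 1 MiB limit are assumptions made for illustration, not part of the answer.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"strings"
)

// scanLines reads r line by line and accepts lines up to about maxLineLen bytes.
// (scanLines and maxLineLen are hypothetical names for this sketch.)
func scanLines(r io.Reader, maxLineLen int) ([]string, error) {
	scanner := bufio.NewScanner(r)
	// Start small; the scanner doubles the buffer as needed, up to maxLineLen.
	scanner.Buffer(make([]byte, 0, 1024), maxLineLen)

	var lines []string
	for scanner.Scan() {
		lines = append(lines, scanner.Text())
	}
	// Err is nil at a normal end of input and bufio.ErrTooLong for an oversized line.
	return lines, scanner.Err()
}

// scanString is a convenience wrapper so the sketch needs no file on disk.
func scanString(s string, maxLineLen int) ([]string, error) {
	return scanLines(strings.NewReader(s), maxLineLen)
}

func main() {
	input := "short line\n" + strings.Repeat("x", 100000) + "\n"

	// The default limit is too small for the second line.
	_, err := scanString(input, bufio.MaxScanTokenSize)
	fmt.Println("default limit too small:", errors.Is(err, bufio.ErrTooLong))

	// A 1 MiB limit (an arbitrary choice here) is large enough.
	lines, err := scanString(input, 1<<20)
	fmt.Println("lines read with raised limit:", len(lines), "error:", err)
}
```

Because the scanner only grows its buffer when a line needs it, a generous limit does not allocate that much memory up front.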

Answer 2

Score: 191

NOTE: This accepted answer was correct in early versions of Go. See the highest-voted answer for the more recent, idiomatic way to do this.

There is a function ReadLine in package bufio.

Please note that if the line does not fit into the read buffer, the function will return an incomplete line. If you want to always read a whole line with a single function call, you will need to wrap ReadLine in your own function that calls ReadLine in a for loop.

bufio.ReadString('\n') isn't fully equivalent to ReadLine: it keeps the trailing '\n' in the returned string, and when the last line of a file does not end with a newline character it returns that line together with io.EOF, so the data must be handled before the error is checked.
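
A minimal sketch of a ReadString loop, assuming the goal is to collect every line, including a last line with no trailing newline. The names readAllLines and linesOf are hypothetical, and the input is an in-memory string so the example runs without a file.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readAllLines collects every line from r using ReadString, including a
// final line that has no trailing newline.
func readAllLines(r io.Reader) ([]string, error) {
	reader := bufio.NewReader(r)
	var lines []string
	for {
		line, err := reader.ReadString('\n')
		// ReadString returns whatever it read before the error, so the
		// data is handled first and the error second.
		if len(line) > 0 {
			lines = append(lines, strings.TrimRight(line, "\r\n"))
		}
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// linesOf is a convenience wrapper for in-memory input.
func linesOf(s string) ([]string, error) {
	return readAllLines(strings.NewReader(s))
}

func main() {
	lines, err := linesOf("first\nsecond\nlast line without newline")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for i, l := range lines {
		fmt.Printf("%d: %q\n", i+1, l)
	}
}
```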

Answer 3

Score: 64

EDIT: As of go1.1, the idiomatic solution is to use bufio.Scanner.

I wrote up a way to easily read each line from a file. The Readln(*bufio.Reader) function returns a line (sans \n) from the underlying bufio.Reader struct.

    // Readln returns a single line (without the ending \n)
    // from the input buffered reader.
    // An error is returned iff there is an error with the
    // buffered reader.
    func Readln(r *bufio.Reader) (string, error) {
        var (
            isPrefix = true
            err      error
            line, ln []byte
        )
        // ReadLine sets isPrefix while a long line is still incomplete,
        // so keep appending until the whole line has been read.
        for isPrefix && err == nil {
            line, isPrefix, err = r.ReadLine()
            ln = append(ln, line...)
        }
        return string(ln), err
    }

You can use Readln to read every line from a file. The following code reads every line in a file and outputs each line to stdout.

    f, err := os.Open(fi)
    if err != nil {
        fmt.Printf("error opening file: %v\n", err)
        os.Exit(1)
    }
    defer f.Close()
    r := bufio.NewReader(f)
    s, e := Readln(r)
    for e == nil {
        fmt.Println(s)
        s, e = Readln(r)
    }
    // e is io.EOF after a normal end of file; anything else is a real error.

Cheers!

Answer 4

Score: 45

There are two common ways to read a file line by line.

  1. Use bufio.Scanner
  2. Use ReadString/ReadBytes/... in bufio.Reader

In my test case (~250MB, ~2,500,000 lines), bufio.Scanner (time used: 0.395491384s) was faster than bufio.Reader.ReadString (time used: 0.446867622s).

Source code: https://github.com/xpzouying/go-practice/tree/master/read_file_line_by_line

Reading the file with bufio.Scanner:

    func scanFile() {
        f, err := os.Open(logfile)
        if err != nil {
            log.Fatalf("open file error: %v", err)
        }
        defer f.Close()

        sc := bufio.NewScanner(f)
        for sc.Scan() {
            _ = sc.Text() // GET the line string
        }
        if err := sc.Err(); err != nil {
            log.Fatalf("scan file error: %v", err)
        }
    }

Reading the file with bufio.Reader:

    func readFileLines() {
        f, err := os.Open(logfile)
        if err != nil {
            log.Fatalf("open file error: %v", err)
        }
        defer f.Close()

        rd := bufio.NewReader(f)
        for {
            line, err := rd.ReadString('\n')
            if len(line) > 0 {
                _ = line // GET the line string (a last line may lack '\n')
            }
            if err == io.EOF {
                break
            }
            if err != nil {
                log.Fatalf("read file line error: %v", err)
            }
        }
    }
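
To try a comparison like this without the original 250MB file, here is a small timing sketch over generated in-memory data. The helper names (countWithScanner, countWithReadString, makeInput, compare), the sample line, and the line count are assumptions for illustration; timings on in-memory strings will differ from timings on real files, so treat the numbers as indicative only.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"time"
)

// countWithScanner counts lines using bufio.Scanner.
func countWithScanner(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// countWithReadString counts lines using bufio.Reader.ReadString.
func countWithReadString(r io.Reader) (int, error) {
	rd := bufio.NewReader(r)
	n := 0
	for {
		line, err := rd.ReadString('\n')
		if len(line) > 0 {
			n++
		}
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
	}
}

// makeInput builds n newline-terminated lines of made-up log text.
func makeInput(n int) string {
	return strings.Repeat("a fairly typical log line of moderate length\n", n)
}

// compare runs both approaches over the same input and reports counts and timings.
func compare(input string) (scanCount, readCount int, scanTime, readTime time.Duration, err error) {
	start := time.Now()
	scanCount, err = countWithScanner(strings.NewReader(input))
	scanTime = time.Since(start)
	if err != nil {
		return
	}
	start = time.Now()
	readCount, err = countWithReadString(strings.NewReader(input))
	readTime = time.Since(start)
	return
}

func main() {
	scanCount, readCount, scanTime, readTime, err := compare(makeInput(200000))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("Scanner:    %d lines in %v\n", scanCount, scanTime)
	fmt.Printf("ReadString: %d lines in %v\n", readCount, readTime)
}
```

For more careful measurements, the same two functions could be wrapped in benchmarks with the standard testing package.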

Answer 5

Score: 21

Example from this gist:

    func readLine(path string) {
        inFile, err := os.Open(path)
        if err != nil {
            fmt.Println(err.Error() + `: ` + path)
            return
        }
        defer inFile.Close()

        scanner := bufio.NewScanner(inFile)
        for scanner.Scan() {
            fmt.Println(scanner.Text()) // the line
        }
        // Without this check, a too-long line ends the loop silently.
        if err := scanner.Err(); err != nil {
            fmt.Println(err.Error() + `: ` + path)
        }
    }

However, this gives an error when a line is larger than the Scanner's buffer.

When that happens, I create reader := bufio.NewReader(inFile) and build up my own buffer, using either ch, err := reader.ReadByte() or len, err := reader.Read(myBuffer).

Another way that I use (replace os.Stdin with a file as above) concatenates the pieces when lines are long (isPrefix) and ignores empty lines:

    func readLines() []string {
        r := bufio.NewReader(os.Stdin)
        buf := []byte{}
        lines := []string{}
        for {
            line, isPrefix, err := r.ReadLine()
            if err != nil {
                break
            }
            buf = append(buf, line...)
            if !isPrefix {
                // The line is complete: keep it unless it is blank.
                str := strings.TrimSpace(string(buf))
                if len(str) > 0 {
                    lines = append(lines, str)
                }
                buf = buf[:0] // reset even after a blank line
            }
        }
        if len(buf) > 0 {
            lines = append(lines, string(buf))
        }
        return lines
    }
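
The reader.Read(myBuffer) idea mentioned in this answer has no code of its own, so here is one possible sketch of it. It reads fixed-size chunks and stitches lines together by hand, which means line length is limited only by memory. The names readLongLines, longLinesOf, and chunkSize are hypothetical, and the tiny chunk size in main is chosen only to force lines to span several reads.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readLongLines reads r in chunks of chunkSize bytes and assembles lines
// manually, so no single line is limited by a fixed buffer size.
func readLongLines(r io.Reader, chunkSize int) ([]string, error) {
	reader := bufio.NewReader(r)
	chunk := make([]byte, chunkSize)
	var pending []byte // bytes of the line currently being assembled
	var lines []string
	for {
		n, err := reader.Read(chunk)
		data := chunk[:n]
		for {
			i := bytes.IndexByte(data, '\n')
			if i < 0 {
				// No newline left in this chunk: keep the rest for later.
				pending = append(pending, data...)
				break
			}
			pending = append(pending, data[:i]...)
			lines = append(lines, string(pending)) // string() copies the bytes
			pending = pending[:0]
			data = data[i+1:]
		}
		if err == io.EOF {
			// Keep an unterminated last line, if there is one.
			if len(pending) > 0 {
				lines = append(lines, string(pending))
			}
			return lines, nil
		}
		if err != nil {
			return lines, err
		}
	}
}

// longLinesOf is a convenience wrapper for in-memory input.
func longLinesOf(s string, chunkSize int) ([]string, error) {
	return readLongLines(strings.NewReader(s), chunkSize)
}

func main() {
	lines, err := longLinesOf("ab\ncdefgh\n\nxyz", 3)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	for _, l := range lines {
		fmt.Printf("%q\n", l)
	}
}
```

Unlike readLines above, this sketch keeps empty lines and does not trim whitespace or '\r'; add that if the input needs it.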

Answer 6

Score: 13

You can also use ReadString with \n as a separator:

    f, err := os.Open(filename)
    if err != nil {
        fmt.Println("error opening file:", err)
        os.Exit(1)
    }
    defer f.Close()
    r := bufio.NewReader(f)
    for {
        line, err := r.ReadString('\n') // '\n' is byte 10 (0x0A)
        if len(line) > 0 {
            // do something with line here; it still ends in '\n'
            // unless it is an unterminated last line
            fmt.Print(line)
        }
        if err == io.EOF {
            break
        } else if err != nil {
            return err // if the enclosing function returns an error
        }
    }

Answer 7

Score: 7

Another method is to read the entire file into memory with os.ReadFile (ioutil.ReadFile from io/ioutil before Go 1.16), convert the bytes into a string, and split it with strings.Split using "\n" (newline) as the delimiter, for example:

    package main

    import (
        "fmt"
        "log"
        "os"
        "strings"
    )

    func main() {
        bytesRead, err := os.ReadFile("something.txt")
        if err != nil {
            log.Fatal(err)
        }
        fileContent := string(bytesRead)
        lines := strings.Split(fileContent, "\n")
        for _, line := range lines {
            fmt.Println(line)
        }
    }

Technically you're not reading the file line by line, but you can still parse each line this way. This method suits smaller files. If you're parsing a massive file, use one of the techniques that reads line by line.
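
One detail with this approach: a file that ends in a newline produces a final empty element after strings.Split, and Windows line endings leave a '\r' on each line. The sketch below shows one way to handle both. The name splitLines is hypothetical, and the content is a string literal standing in for what os.ReadFile would return.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines splits already-loaded file content into lines. It normalizes
// "\r\n" to "\n" and ignores a single trailing newline so that the result
// has no spurious empty last element.
func splitLines(content string) []string {
	if content == "" {
		return nil
	}
	content = strings.ReplaceAll(content, "\r\n", "\n")
	content = strings.TrimSuffix(content, "\n")
	return strings.Split(content, "\n")
}

func main() {
	// Stand-in for string(bytesRead) from the example above.
	content := "alpha\r\nbeta\n\ngamma\n"
	for i, line := range splitLines(content) {
		fmt.Printf("%d: %q\n", i+1, line)
	}
}
```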

Answer 8

Score: 6

bufio.Reader.ReadLine() works well. But if you want each line as a string, try ReadString('\n'); there is no need to reinvent the wheel.

Answer 9

Score: 4

    // readline strips the '\n' or reads until EOF; it returns an error if the read fails
    func readline(reader io.Reader) (line []byte, err error) {
        line = make([]byte, 0, 100)
        b := make([]byte, 1) // one-byte read buffer, reused on every iteration
        for {
            n, er := reader.Read(b)
            if n > 0 {
                c := b[0]
                if c == '\n' { // end of line
                    break
                }
                line = append(line, c)
            }
            if er != nil {
                err = er
                return
            }
        }
        return
    }

Answer 10

Score: 2

In the code below, I use ReadLine to read interests from the CLI until the user hits Enter on an empty line:

    var interests []string
    r := bufio.NewReader(os.Stdin)
    for {
        fmt.Print("Give me an interest:")
        t, _, err := r.ReadLine()
        if err != nil || len(t) == 0 {
            break // stop on EOF, a read error, or an empty line
        }
        interests = append(interests, string(t))
    }
    fmt.Println(interests)

Answer 11

Score: 1

    import (
        "bufio"
        "os"
        "strings"
    )

    var (
        reader = bufio.NewReader(os.Stdin)
    )

    // ReadFromStdin returns one line from standard input without its line ending.
    func ReadFromStdin() string {
        result, _ := reader.ReadString('\n')
        // TrimRight is safe on empty input, unlike result[:len(result)-1].
        return strings.TrimRight(result, "\r\n")
    }

Here is an example using the function ReadFromStdin(). It works like fmt.Scan(&name), but it takes the whole line, including blank spaces, such as "Hello My Name Is ...":

    var name string = ReadFromStdin()
    println(name)

Answer 12

Score: 0

The Scan* functions are of great use here. Here is a slightly modified version of the word scanner example from the Go docs that scans lines instead. It reads from a string; to scan lines from a file, pass the opened file to bufio.NewScanner.

    package main

    import (
        "bufio"
        "fmt"
        "os"
        "strings"
    )

    func main() {
        // An artificial input source.
        const input = "Now is the winter of our discontent,\nMade glorious summer by this sun of York.\n"
        scanner := bufio.NewScanner(strings.NewReader(input))
        // Set the split function for the scanning operation.
        // (bufio.ScanLines is also the default.)
        scanner.Split(bufio.ScanLines)
        // Count the lines.
        count := 0
        for scanner.Scan() {
            fmt.Println(scanner.Text())
            count++
        }
        if err := scanner.Err(); err != nil {
            fmt.Fprintln(os.Stderr, "reading input:", err)
        }
        fmt.Printf("%d\n", count)
    }
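
To illustrate how the split function changes what Scan returns, here is a small sketch that counts tokens in the same input with two of the standard split functions, bufio.ScanLines and bufio.ScanWords. The helper names countTokens, countLines, and countWords are assumptions made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countTokens scans input with the given split function and counts the tokens.
func countTokens(input string, split bufio.SplitFunc) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Split(split)
	count := 0
	for scanner.Scan() {
		count++
	}
	return count, scanner.Err()
}

// countLines counts newline-separated lines.
func countLines(input string) (int, error) {
	return countTokens(input, bufio.ScanLines)
}

// countWords counts whitespace-separated words.
func countWords(input string) (int, error) {
	return countTokens(input, bufio.ScanWords)
}

func main() {
	const input = "Now is the winter of our discontent,\nMade glorious summer by this sun of York.\n"

	lines, err := countLines(input)
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	words, err := countWords(input)
	if err != nil {
		fmt.Println("scan error:", err)
		return
	}
	fmt.Println("lines:", lines)
	fmt.Println("words:", words)
}
```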

Answer 13

Score: -2

As of Go 1.16, we can use package embed to read a file's contents, as shown below. Note that the file is embedded into the binary at build time, and that the go:embed directives must be attached to package-level variables.

    package main

    import "embed"

    // Each directive below embeds hello.txt into the variable that follows it.

    //go:embed hello.txt
    var s string

    //go:embed hello.txt
    var b []byte

    //go:embed hello.txt
    var f embed.FS

    func main() {
        print(s)
        print(string(b))
        data, _ := f.ReadFile("hello.txt")
        print(string(data))
    }

For more details, see https://tip.golang.org/pkg/embed/ and https://golangtutorial.dev/tips/embed-files-in-go/

huangapple
  • Posted on January 6, 2012 at 19:50:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/8757389.html