Reading a file line by line in Go

Question

I'm unable to find file.ReadLine function in Go.

How does one read a file line by line?

Answer 1

Score: 919

In Go 1.1 and newer, the simplest way to do this is with a bufio.Scanner. Here is a simple example that reads lines from a file:

package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
)

func main() {
	file, err := os.Open("/path/to/file.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	scanner := bufio.NewScanner(file)
	// optionally, resize scanner's capacity for lines over 64K, see next example
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}

	if err := scanner.Err(); err != nil {
		log.Fatal(err)
	}
}

This is the cleanest way to read from a Reader line by line.

There is one caveat: by default, Scanner fails with bufio.ErrTooLong on lines longer than 65536 bytes (bufio.MaxScanTokenSize). If you know your lines can exceed 64K, use the Buffer() method to increase the scanner's capacity:

...
scanner := bufio.NewScanner(file)

const maxCapacity int = longLineLen  // your required line length
buf := make([]byte, maxCapacity)
scanner.Buffer(buf, maxCapacity)

for scanner.Scan() {
...

Answer 2

Score: 191

NOTE: The accepted answer was correct in early versions of Go. See the highest-voted answer for the more recent, idiomatic way to achieve this.

There is a function ReadLine in package bufio.

Please note that if the line does not fit into the read buffer, the function returns an incomplete line. If you want each call to always return a whole line, you need to wrap ReadLine in your own function that calls it in a for loop.

bufio.ReadString('\n') isn't fully equivalent to ReadLine: ReadString keeps the trailing newline in the returned string, and when the last line of a file does not end with a newline it returns that line together with io.EOF, so a loop that discards the data on any error drops the final line.

Answer 3

Score: 64

EDIT: As of Go 1.1, the idiomatic solution is to use bufio.Scanner.

I wrote up a way to easily read each line from a file. The Readln(*bufio.Reader) function returns a line (sans \n) from the underlying bufio.Reader struct.

// Readln returns a single line (without the ending \n)
// from the input buffered reader.
// An error is returned iff there is an error with the
// buffered reader.
func Readln(r *bufio.Reader) (string, error) {
  var (
    isPrefix = true
    err      error
    line, ln []byte
  )
  for isPrefix && err == nil {
    line, isPrefix, err = r.ReadLine()
    ln = append(ln, line...)
  }
  return string(ln), err
}

You can use Readln to read every line from a file. The following code reads every line in a file and outputs each line to stdout.

f, err := os.Open(fi)
if err != nil {
    fmt.Printf("error opening file: %v\n", err)
    os.Exit(1)
}
defer f.Close()
r := bufio.NewReader(f)
s, e := Readln(r)
for e == nil {
    fmt.Println(s)
    s, e = Readln(r)
}

Cheers!

Answer 4

Score: 45

There are two common ways to read a file line by line.

  1. Use bufio.Scanner
  2. Use ReadString/ReadBytes/... in bufio.Reader

In my test case (~250MB, ~2,500,000 lines), bufio.Scanner (time used: 0.395491384s) was faster than bufio.Reader.ReadString (time used: 0.446867622s).

Source code: https://github.com/xpzouying/go-practice/tree/master/read_file_line_by_line

Reading the file with bufio.Scanner:

func scanFile() {
    f, err := os.OpenFile(logfile, os.O_RDONLY, os.ModePerm)
    if err != nil {
        log.Fatalf("open file error: %v", err) // log.Fatalf exits, so no return is needed
    }
    defer f.Close()

    sc := bufio.NewScanner(f)
    for sc.Scan() {
        _ = sc.Text() // the line, without its trailing newline
    }
    if err := sc.Err(); err != nil {
        log.Fatalf("scan file error: %v", err)
    }
}

Reading the file with bufio.Reader:

func readFileLines() {
    f, err := os.OpenFile(logfile, os.O_RDONLY, os.ModePerm)
    if err != nil {
        log.Fatalf("open file error: %v", err)
    }
    defer f.Close()

    rd := bufio.NewReader(f)
    for {
        line, err := rd.ReadString('\n')
        if err != nil && err != io.EOF {
            log.Fatalf("read file line error: %v", err)
        }
        if len(line) > 0 {
            _ = line // the line, including its trailing '\n' if present
        }
        if err == io.EOF {
            break // checked after using line, so a last line without '\n' is not lost
        }
    }
}

Answer 5

Score: 21

Example from this gist

func readLine(path string) {
  inFile, err := os.Open(path)
  if err != nil {
     fmt.Println(err.Error() + `: ` + path)
     return
  }
  defer inFile.Close()

  scanner := bufio.NewScanner(inFile)
  for scanner.Scan() {
    fmt.Println(scanner.Text()) // the line
  }
}

But this fails when a line is larger than the Scanner's buffer; the loop ends early and scanner.Err() (not checked above) reports the error.

When that happens, I use reader := bufio.NewReader(inFile) and build up my own buffer, using either ch, err := reader.ReadByte() or n, err := reader.Read(myBuffer).

Another way that I use (replace os.Stdin with a file, as above) concatenates the pieces of long lines (isPrefix) and ignores empty lines:


func readLines() []string {
  r := bufio.NewReader(os.Stdin)
  bytes := []byte{}
  lines := []string{}
  for {
    line, isPrefix, err := r.ReadLine()
    if err != nil {
      break
    }
    bytes = append(bytes, line...)
    if !isPrefix {
      // A complete line has been assembled: keep it unless it is blank.
      str := strings.TrimSpace(string(bytes))
      if len(str) > 0 {
        lines = append(lines, str)
      }
      bytes = bytes[:0] // reset for the next line, even if this one was blank
    }
  }
  if len(bytes) > 0 {
    // Partial line left over when a read error interrupted a long line.
    lines = append(lines, string(bytes))
  }
  return lines
}

Answer 6

Score: 13

You can also use ReadString with \n as a separator:

  f, err := os.Open(filename)
  if err != nil {
    fmt.Println("error opening file:", err)
    os.Exit(1)
  }
  defer f.Close()
  r := bufio.NewReader(f)
  for {
    line, err := r.ReadString('\n') // '\n' is byte 10 (0x0A)
    if err == io.EOF {
      // line holds any final text that lacks a trailing newline; handle it here
      break
    } else if err != nil {
      return err // if the enclosing function returns an error
    }
    fmt.Print(line) // line still includes its trailing '\n'
  }

Answer 7

Score: 7

Another method is to use the io/ioutil and strings packages to read the entire file's bytes, convert them into a string, and split it using "\n" (newline) as the delimiter. (Since Go 1.16, os.ReadFile is the preferred replacement for ioutil.ReadFile.) For example:

import (
    "fmt"
    "io/ioutil"
    "log"
    "strings"
)

func main() {
    bytesRead, err := ioutil.ReadFile("something.txt")
    if err != nil {
        log.Fatal(err)
    }
    fileContent := string(bytesRead)
    lines := strings.Split(fileContent, "\n")
    for _, line := range lines {
        fmt.Println(line)
    }
}

Technically you're not reading the file line by line; however, you can still process each line this way. This method suits smaller files. If you need to parse a massive file, use one of the techniques that reads line by line.

Answer 8

Score: 6

bufio.Reader.ReadLine() works well. But if you want to read each line as a string, try ReadString('\n'). There's no need to reinvent the wheel.

Answer 9

Score: 4

// readline reads bytes up to (and not including) '\n' or until EOF.
// It returns any read error, including io.EOF, along with the bytes read so far.
func readline(reader io.Reader) (line []byte, err error) {
	line = make([]byte, 0, 100)
	b := make([]byte, 1) // one-byte read buffer, reused across iterations
	for {
		n, er := reader.Read(b)
		if n > 0 {
			c := b[0]
			if c == '\n' { // end of line
				break
			}
			line = append(line, c)
		}
		if er != nil {
			err = er
			return
		}
	}
	return
}

Answer 10

Score: 2

In the code below, I use ReadLine to read interests from the CLI until the user enters an empty line:

var interests []string
r := bufio.NewReader(os.Stdin)
for {
	fmt.Print("Give me an interest: ")
	t, _, _ := r.ReadLine()
	if len(t) == 0 { // empty line (or EOF/error): stop without storing it
		break
	}
	interests = append(interests, string(t))
}
fmt.Println(interests)

Answer 11

Score: 1

import (
	"bufio"
	"os"
	"strings"
)

var reader = bufio.NewReader(os.Stdin)

// ReadFromStdin reads one line from standard input and returns it
// without the trailing line ending.
func ReadFromStdin() string {
	result, _ := reader.ReadString('\n')
	// TrimRight avoids the panic that result[:len(result)-1] causes on empty input (EOF).
	return strings.TrimRight(result, "\r\n")
}

Here is an example using ReadFromStdin(). It works like fmt.Scan(&name), but it takes the whole line, including blank spaces, such as "Hello My Name Is ...":

var name string = ReadFromStdin()

println(name)

Answer 12

Score: 0

The Scan* functions are of great use here. Here is a slightly modified version of the word scanner example from the Go docs that scans lines instead (here from a string; any io.Reader, such as a file, works the same way).

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	// An artificial input source.
	const input = "Now is the winter of our discontent,\nMade glorious summer by this sun of York.\n"
	scanner := bufio.NewScanner(strings.NewReader(input))
	// Set the split function for the scanning operation.
	scanner.Split(bufio.ScanLines)
	// Count the lines.
	count := 0
	for scanner.Scan() {
		fmt.Println(scanner.Text())
		count++
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
	}
	fmt.Printf("%d\n", count)
}

Answer 13

Score: -2

Since Go 1.16, we can use package embed to read a file's contents, as shown below. The file is embedded into the binary at build time, and the //go:embed directives must be attached to package-level variables.

package main

import "embed"

//go:embed hello.txt
var s string

//go:embed hello.txt
var b []byte

//go:embed hello.txt
var f embed.FS

func main() {
	print(s)
	print(string(b))

	data, _ := f.ReadFile("hello.txt")
	print(string(data))
}

For more details, see https://tip.golang.org/pkg/embed/
and
https://golangtutorial.dev/tips/embed-files-in-go/

huangapple
  • Posted on January 6, 2012 at 19:50:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/8757389.html