2012-10-11 05:22:17 · go

How to improve this file reading code

Question

I currently have this piece of code that will read a file line by line (delimited by a \n)

file, _ := os.Open(filename) // deal with the error later
defer file.Close()
buf := bufio.NewReader(file)
for line, err := buf.ReadString('\n'); err != io.EOF; line, err = buf.ReadString('\n') {
    fmt.Println(strings.TrimRight(line, "\n"))
}

However, I don't feel comfortable writing buf.ReadString('\n') twice in the for loop. Does anyone have suggestions for improvement?

Answer 1

Score: 4

> bufio.ReadString reads until the first occurrence of delim in the input,
> returning a string containing the data up to and including the
> delimiter. If ReadString encounters an error before finding a
> delimiter, it returns the data read before the error and the error
> itself (often io.EOF). ReadString returns err != nil if and only if
> the returned data does not end in delim.

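If buf.ReadString('\n') returns an error other than io.EOF, for example a read error from the underlying file, the loop condition err != io.EOF stays true and the loop may never terminate. Also, if the file doesn't end in a '\n', you silently ignore the data after the last '\n': that data is returned together with io.EOF, so the loop exits before printing it.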

Here's a more robust solution, which executes buf.ReadString('\n') once.

package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

func main() {
	filename := "FileName"
	file, err := os.Open(filename)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer file.Close()
	buf := bufio.NewReader(file)
	for {
		line, err := buf.ReadString('\n')
		if err != nil {
			// Report any non-EOF error, and also EOF that arrives with
			// leftover data (a final line that lacks its '\n').
			if err != io.EOF || len(line) > 0 {
				fmt.Println(err)
				return
			}
			// Clean EOF with no pending data: all lines have been read.
			break
		}
		fmt.Println(strings.TrimRight(line, "\n"))
	}
}

Answer 2

Score: 1

Most code that reads line by line can be improved by not reading line by line. If your goal is to read the file and access the lines, something like the following is almost always better.

package main

import (
    "fmt"
    "log"
    "os"
    "strings"
)

func main() {
    // os.ReadFile replaces ioutil.ReadFile, which is deprecated since Go 1.16.
    b, err := os.ReadFile("filename")
    if err != nil {
        log.Fatal(err)
    }
    s := string(b)                    // convert []byte to string
    s = strings.TrimRight(s, "\n")    // strip \n on last line
    lines := strings.Split(s, "\n")   // split to []string
    for _, line := range lines {
        fmt.Println(line)
    }
}

Any errors come to you at a single point so error handling is simplified. Stripping a newline off the last line allows for files that may or may not have that final newline, as Peter suggested. Most text files are tiny compared to available memory these days, so reading these in one gulp is appropriate.
