How to improve this file reading code
Question
I currently have this piece of code that reads a file line by line (delimited by \n):
    file, _ := os.Open(filename) // deal with the error later
    defer file.Close()
    buf := bufio.NewReader(file)
    for line, err := buf.ReadString('\n'); err != io.EOF; line, err = buf.ReadString('\n') {
        fmt.Println(strings.TrimRight(line, "\n"))
    }
However, I don't feel comfortable writing buf.ReadString('\n') twice in the for loop. Does anyone have any suggestions for improvement?
Answer 1
Score: 4
> bufio.ReadString reads until the first occurrence of delim in the input,
> returning a string containing the data up to and including the
> delimiter. If ReadString encounters an error before finding a
> delimiter, it returns the data read before the error and the error
> itself (often io.EOF). ReadString returns err != nil if and only if
> the returned data does not end in delim.
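If buf.ReadString('\n') returns an error other than io.EOF, for example an I/O error from the underlying file, you will be in an infinite loop for as long as that error persists, because the loop only exits on io.EOF. Also, if the file doesn't end in a '\n', you silently ignore the data after the last '\n': the final read returns that data together with io.EOF, and the loop exits before printing it.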
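To make the quoted behavior concrete, here is a minimal sketch that records what ReadString returns at the end of an input with no trailing newline. The helper name readAll and the sample input are assumptions made for illustration; they are not part of the question or the answer.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readAll (hypothetical helper) calls ReadString('\n') until it reports an
// error, and returns every non-empty chunk it saw plus that final error.
func readAll(input string) ([]string, error) {
	buf := bufio.NewReader(strings.NewReader(input))
	var chunks []string
	for {
		line, err := buf.ReadString('\n')
		if line != "" {
			chunks = append(chunks, line)
		}
		if err != nil {
			return chunks, err
		}
	}
}

func main() {
	// Assumed sample input: the last line has no trailing '\n'.
	chunks, err := readAll("first\nsecond")
	fmt.Printf("%q %v\n", chunks, err == io.EOF)
	// prints: ["first\n" "second"] true
}
```

The last chunk, "second", arrives in the same call that returns io.EOF, which is why a loop that only checks the error drops it.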
Here's a more robust solution, which executes buf.ReadString('\n') once.
    package main

    import (
        "bufio"
        "fmt"
        "io"
        "os"
        "strings"
    )

    func main() {
        filename := "FileName"
        file, err := os.Open(filename)
        if err != nil {
            fmt.Println(err)
            return
        }
        defer file.Close()

        buf := bufio.NewReader(file)
        for {
            line, err := buf.ReadString('\n')
            if err != nil {
                // Report any error other than a clean EOF. EOF that arrives
                // with leftover data (a final line with no '\n') is also
                // reported instead of being dropped silently.
                if err != io.EOF || len(line) > 0 {
                    fmt.Println(err)
                    return
                }
                break // clean EOF: no more data
            }
            fmt.Println(strings.TrimRight(line, "\n"))
        }
    }
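As an alternative that neither answer uses, the standard library's bufio.Scanner also calls the read in one place. This is a sketch under stated assumptions, not a replacement for the answer's method: scanLines and linesOf are hypothetical helper names, and "FileName" is the same placeholder as above. Unlike the loop above, it returns a final line that lacks a trailing '\n' as an ordinary line instead of reporting it as an error.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// scanLines (hypothetical helper) returns every line from r with its line
// terminator removed. The default split function, bufio.ScanLines, strips
// a trailing "\n" or "\r\n".
func scanLines(r io.Reader) ([]string, error) {
	var lines []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	// Err returns nil when scanning stopped at a clean end of input.
	return lines, sc.Err()
}

// linesOf (hypothetical helper) runs scanLines over in-memory text.
func linesOf(s string) ([]string, error) {
	return scanLines(strings.NewReader(s))
}

func main() {
	file, err := os.Open("FileName") // placeholder path
	if err != nil {
		fmt.Println(err)
		return
	}
	defer file.Close()

	lines, err := scanLines(file)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

One caveat: by default a Scanner stops with bufio.ErrTooLong on a line longer than bufio.MaxScanTokenSize (64 KiB); the limit can be raised with the Scanner's Buffer method. ReadString has no such limit.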
Answer 2
Score: 1
Most code that reads line by line can be improved by not reading line by line. If your goal is to read the file and access the lines, something like the following is almost always better.
    package main

    import (
        "fmt"
        "log"
        "os"
        "strings"
    )

    func main() {
        // os.ReadFile supersedes ioutil.ReadFile, which is deprecated
        // as of Go 1.16.
        b, err := os.ReadFile("filename")
        if err != nil {
            log.Fatal(err)
        }
        s := string(b)                 // convert []byte to string
        s = strings.TrimRight(s, "\n") // strip \n on last line
        ss := strings.Split(s, "\n")   // split to []string
        for _, line := range ss {
            fmt.Println(line)
        }
    }
Any errors come to you at a single point so error handling is simplified. Stripping a newline off the last line allows for files that may or may not have that final newline, as Peter suggested. Most text files are tiny compared to available memory these days, so reading these in one gulp is appropriate.
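A small sketch of the trim-then-split step on its own may help show how it treats the final newline. The helper name splitLines and the sample strings are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLines (hypothetical helper) mirrors the answer's two steps:
// strip trailing newlines, then split on '\n'.
func splitLines(s string) []string {
	s = strings.TrimRight(s, "\n")
	return strings.Split(s, "\n")
}

func main() {
	fmt.Printf("%q\n", splitLines("a\nb\n")) // prints: ["a" "b"]
	fmt.Printf("%q\n", splitLines("a\nb"))   // prints: ["a" "b"]
	fmt.Printf("%q\n", splitLines(""))       // prints: [""]
}
```

Two edge cases follow from how these functions work: strings.TrimRight removes every trailing '\n', so blank lines at the end of the file are dropped, and an empty file yields a slice holding one empty string rather than an empty slice.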