Why does creating a new bufio.NewScanner on the same reader cause an EOF, even though the source has not been fully read?
Question
I have a file with this content:
3
09:00 19:00
10
someText1
someText2
...
First I want to read the first 3 lines of the configuration:
sourceFile, err := os.OpenFile(fileName, os.O_RDONLY, 0644)
cfg := ReadConfig(sourceFile)

func ReadConfig(r io.Reader) Config {
	scanner := bufio.NewScanner(r)
	var cfg Config
	for i := 0; i < 3 && scanner.Scan(); i++ {
		switch i {
		case 0:
			cfg.Parts, _ = strconv.Atoi(scanner.Text())
		case 1:
			timeParts := strings.Split(scanner.Text(), " ")
			cfg.Start, _ = time.Parse("15:04", timeParts[0])
			cfg.End, _ = time.Parse("15:04", timeParts[1])
		case 2:
			cfg.Val, _ = strconv.Atoi(scanner.Text())
		}
	}
	return cfg
}
These lines are read correctly. Now, in another function, I want to read the remaining lines:
Read(ctx, sourceFile)

func Read(ctx context.Context, r io.Reader) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		event := parseEvent(scanner.Text())
	}
	return nil
}
But nothing is read, because EOF is returned immediately. Why does this happen when I have read only 3 lines?
Answer 1
Score: 1
bufio.Scanner uses a buffer internally (source code):
type Scanner struct {
	r            io.Reader
	split        SplitFunc
	maxTokenSize int
	token        []byte
	buf          []byte // Buffer used as argument to split.
	//^^^^^^^^^^^^^^^^^
	start        int
	end          int
	err          error
	empties      int
	scanCalled   bool
	done         bool
}
The initial buffer size is 4096 (source code):
const (
	startBufSize = 4096 // Size of initial allocation for buffer.
)
And the first call to Scan reads up to startBufSize bytes from the underlying reader (source code).
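So although you read only 3 lines through the scanner, the scanner may have pulled far more bytes into its buffer. If the whole file fits into that first read, the file offset is already at the end, so a second scanner created on the same reader gets EOF immediately; the lines you have not consumed yet are stranded in the first scanner's buffer.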
bufio.Reader has a similar behavior. See this question: https://stackoverflow.com/questions/76149992/multiple-arrow-csv-readers-on-same-file-returns-null.