Why does reusing bufio.NewScanner with the same reader cause an EOF, even though the source has not been fully read?

Question

I have a file with this content:

3
09:00 19:00
10
someText1
someText2
... 

First I want to read the first 3 lines of the configuration:

sourceFile, err := os.OpenFile(fileName, os.O_RDONLY, 0644)
cfg := ReadConfig(sourceFile)

func ReadConfig(r io.Reader) Config {
	scanner := bufio.NewScanner(r)
	var cfg Config
	for i := 0; i < 3 && scanner.Scan(); i++ {
		switch i {
		case 0:
			cfg.Parts, _ = strconv.Atoi(scanner.Text())
		case 1:
			timeParts := strings.Split(scanner.Text(), " ")
			cfg.Start, _ = time.Parse("15:04", timeParts[0])
			cfg.End, _ = time.Parse("15:04", timeParts[1])
		case 2:
			cfg.Val, _ = strconv.Atoi(scanner.Text())
		}
	}
	return cfg
}

These lines are read correctly. Then, in another function, I want to read the remaining lines:

Read(ctx, sourceFile)

func Read(ctx context.Context, r io.Reader) error {
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		event := parseEvent(scanner.Text())
	}
	return nil
}

But nothing is read, because EOF is returned immediately. Why does this happen when I have read only 3 lines?

Answer 1

Score: 1

bufio.Scanner uses a buffer internally (source code):

type Scanner struct {
	r            io.Reader
	split        SplitFunc
	maxTokenSize int
	token        []byte
	buf          []byte    // Buffer used as argument to split.
	//^^^^^^^^^^^^^^^^^
	start        int
	end          int
	err          error
	empties      int
	scanCalled   bool
	done         bool
}

The initial buffer size is 4096 bytes (source code):

const (
	startBufSize = 4096 // Size of initial allocation for buffer.
)

The first call to Scan reads up to startBufSize bytes from the underlying reader (source code).

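So although you consume only 3 lines through the scanner, the scanner itself may have read far more bytes into its buffer. If the file is smaller than 4096 bytes, the first Scan call can pull the entire file into that buffer, leaving the file offset at the end. A second scanner created on the same reader starts from that offset, so it gets EOF immediately, and the buffered but unscanned lines stay behind in the first scanner.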

bufio.Reader behaves similarly. See this question: https://stackoverflow.com/questions/76149992/multiple-arrow-csv-readers-on-same-file-returns-null.

huangapple
  • Published on July 9, 2023 at 18:41:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76647050.html