Why does creating a new bufio.Scanner on the same reader cause an EOF, even though the source has not been fully read?


Question


I have a file with this content:

    3
    09:00 19:00
    10
    someText1
    someText2
    ...

First I want to read the first 3 lines of the configuration:

    // Call site (the original does not show how err is handled).
    sourceFile, err := os.OpenFile(fileName, os.O_RDONLY, 0644)
    cfg := ReadConfig(sourceFile)

    // ReadConfig parses the first three lines of r into a Config.
    func ReadConfig(r io.Reader) Config {
        scanner := bufio.NewScanner(r)
        var cfg Config
        for i := 0; i < 3 && scanner.Scan(); i++ {
            switch i {
            case 0:
                cfg.Parts, _ = strconv.Atoi(scanner.Text())
            case 1:
                timeParts := strings.Split(scanner.Text(), " ")
                cfg.Start, _ = time.Parse("15:04", timeParts[0])
                cfg.End, _ = time.Parse("15:04", timeParts[1])
            case 2:
                cfg.Val, _ = strconv.Atoi(scanner.Text())
            }
        }
        return cfg
    }

These lines are read correctly. Now, in another function, I want to read the remaining lines:

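    // Call site: the same sourceFile is passed again.
    Read(ctx, sourceFile)

    // Read is meant to parse every line left after the configuration.
    func Read(ctx context.Context, r io.Reader) error {
        scanner := bufio.NewScanner(r)
        for scanner.Scan() {
            event := parseEvent(scanner.Text())
            _ = event // event processing is not shown in the original
        }
        return nil
    }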

But nothing is read, because EOF is returned immediately. Why does this happen when I have read only 3 lines?

Answer 1

Score: 1


bufio.Scanner uses a buffer internally (source code):

    type Scanner struct {
        r            io.Reader
        split        SplitFunc
        maxTokenSize int
        token        []byte
        buf          []byte // Buffer used as argument to split.
        //^^^^^^^^^^^^^^^^^
        start        int
        end          int
        err          error
        empties      int
        scanCalled   bool
        done         bool
    }

The initial buffer size is 4096 (source code):

    const (
        startBufSize = 4096 // Size of initial allocation for buffer.
    )

The first call to Scan then reads up to startBufSize bytes from the underlying reader (source code).

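So although you read only 3 lines with the scanner, the scanner may have read far more bytes into its buffer. A file as small as the one in the question most likely fits entirely into that first read, which leaves the underlying reader at its end, so a second scanner created on the same reader gets EOF immediately.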

bufio.Reader has a similar behavior. See this question: https://stackoverflow.com/questions/76149992/multiple-arrow-csv-readers-on-same-file-returns-null.

huangapple
  • Posted on July 9, 2023, at 18:41:59
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76647050.html