Scanner.Buffer – max value has no effect on custom Split?

Question

To reduce the default 64 KB scanner buffer (for a microcomputer with little memory), I am trying to use my own buffer together with a custom split function:

scanner.Buffer(make([]byte, 5120), 64)
scanner.Split(Scan64Bytes)

Here I noticed that the second Buffer argument, "max", has no effect. If I instead pass e.g. 0, 1, 5120 or bufio.MaxScanTokenSize, I can't see any difference.
Only the first argument, "buf", has consequences: if its capacity is too small, the scan is incomplete, and if it is too large, the B/op value reported by benchmem increases.

From the doc:
> The maximum token size is the larger of max and cap(buf). If max <= cap(buf), Scan will use this buffer only and do no allocation.

I don't understand what the correct max value is. Could you explain this to me, please?

Go Playground

package main

import (
	"bufio"
	"bytes"
	"fmt"
)

func Scan64Bytes(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if len(data) < 64 {
		return 0, data[0:], bufio.ErrFinalToken
	}
	return 64, data[0:64], nil
}

func main() {
	// improvised source of the same size:
	cmdstd := bytes.NewReader(make([]byte, 5120))
	scanner := bufio.NewScanner(cmdstd)

	// I guess 64 is the correct max arg:
	scanner.Buffer(make([]byte, 5120), 64)
	scanner.Split(Scan64Bytes)

	for i := 0; scanner.Scan(); i++ {
		fmt.Printf("%v: %v\r\n", i, scanner.Bytes())
	}

	if err := scanner.Err(); err != nil {
		fmt.Println(err)
	}
}

Answer 1

Score: 1

> max value has no effect on custom Split?

No, you get the same result without a custom split function. But your setup would not work without the split function and ErrFinalToken:

// your reader/input
cmdstd := bytes.NewReader(make([]byte, 5120))

// your scanner buffer size
scanner.Buffer(make([]byte, 5120), 64)

The scanner's buffer should be larger than the input. This is how I would set buf and max:

scanner.Buffer(make([]byte, 5121), 5120)
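
To illustrate when max comes into play, here is a minimal sketch (not part of the original answer). As far as I understand the bufio implementation, max is only checked when the buffer is full and the split function still asks for more data; the Scanner then doubles the buffer, but never beyond max. With 64-byte tokens and a 5120-byte buffer that situation never arises, which would explain why changing max makes no visible difference. The names chunk64 and countTokens are hypothetical helpers. chunk64 also differs from Scan64Bytes in that it requests more data instead of returning bufio.ErrFinalToken when fewer than 64 bytes are buffered, which is probably why a small buffer led to an incomplete scan in the question.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
)

// chunk64 is a hypothetical split function that emits 64-byte tokens.
// Unlike Scan64Bytes, it asks the Scanner for more data when fewer than
// 64 bytes are buffered and the input is not yet exhausted.
func chunk64(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if len(data) >= 64 {
		return 64, data[:64], nil
	}
	if atEOF && len(data) > 0 {
		// Trailing partial chunk at the end of the input.
		return len(data), data, nil
	}
	// Request more data (or let the Scanner stop at end of input).
	return 0, nil, nil
}

// countTokens scans 5120 zero bytes using an initial buffer of capacity
// bufCap and the given maxSize, and returns the number of tokens seen
// together with any scan error.
func countTokens(bufCap, maxSize int) (int, error) {
	scanner := bufio.NewScanner(bytes.NewReader(make([]byte, 5120)))
	scanner.Buffer(make([]byte, bufCap), maxSize)
	scanner.Split(chunk64)

	n := 0
	for scanner.Scan() {
		n++
	}
	return n, scanner.Err()
}

func main() {
	cases := []struct{ bufCap, maxSize int }{
		{5120, 0},  // buffer is already big enough, so it never has to grow
		{5120, 64}, // same buffer, different max
		{16, 64},   // buffer has to grow from 16 up to 64 bytes
		{16, 32},   // buffer may only grow to 32 bytes, too small for a token
	}
	for _, c := range cases {
		n, err := countTokens(c.bufCap, c.maxSize)
		fmt.Printf("cap=%d max=%d tokens=%d err=%v\n", c.bufCap, c.maxSize, n, err)
	}
}
```

In this sketch the first three cases should each yield 80 tokens, while the last one should stop with bufio.ErrTooLong, because a 64-byte token cannot fit into a buffer capped at 32 bytes. So for a memory-constrained device, a small initial buffer with max set to at least the token size looks like a reasonable starting point.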

huangapple
  • Posted on April 9, 2022 at 05:08:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/71803150.html