Avoiding excessive memory allocation in golang when using an io.Writer

Question

I am working on a command-line tool in Go called redis-mass that converts a list of Redis commands into Redis protocol format.

The first step was to port the Node.js version to Go almost literally. I used ioutil.ReadFile(inputFileName) to get a string version of the file and then returned an encoded string as output.

When I ran this on a file with 2,000,000 Redis commands, it took about 8 seconds, compared to about 16 seconds with the Node.js version. I guessed that it was only twice as fast because it was reading the whole file into memory first, so I changed my encoding function to accept a pair (raw io.Reader, enc io.Writer), and it looks like this:

func EncodeStream(raw io.Reader, enc io.Writer) {
    var args []string
    var length int

    scanner := bufio.NewScanner(raw)

    for scanner.Scan() {
        command := strings.TrimSpace(scanner.Text())
        args = parse(command)
        length = len(args)
        if length > 0 {
            io.WriteString(enc, fmt.Sprintf("*%d\r\n", length))
            for _, arg := range args {
                io.WriteString(enc, fmt.Sprintf("$%d\r\n%s\r\n", len(arg), arg))
            }
        }
    }
}

However, this took 12 seconds on the 2 million line file, so I used github.com/pkg/profile to see how it was using memory, and it looks like the number of memory allocations is huge:

# Alloc = 3162912
# TotalAlloc = 1248612816
# Mallocs = 46001048
# HeapAlloc = 3162912

Can I constrain the io.Writer to use a fixed-size buffer and avoid all those allocations?

More generally, how can I avoid excessive allocations in this method? Here's the full source for more context.

Answer 1

Score: 1

Reduce allocations by working with []byte instead of strings, and write directly to the output with fmt.Fprintf instead of combining fmt.Sprintf and io.WriteString.

func EncodeStream(raw io.Reader, enc io.Writer) {
    var args []string
    var length int

    scanner := bufio.NewScanner(raw)

    for scanner.Scan() {
        // Bytes avoids the per-line string allocation that Text makes.
        command := bytes.TrimSpace(scanner.Bytes())
        args = parse(command) // parse must now accept a []byte
        length = len(args)
        if length > 0 {
            // Fprintf formats straight into enc, with no intermediate string.
            fmt.Fprintf(enc, "*%d\r\n", length)
            for _, arg := range args {
                fmt.Fprintf(enc, "$%d\r\n%s\r\n", len(arg), arg)
            }
        }
    }
}

huangapple
  • Posted on January 21, 2016, 01:30:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/34906668.html