Avoiding excessive memory allocation in golang when using an io.Writer
Question
I am working on a command line tool in Go called redis-mass that converts a bunch of redis commands into redis protocol format.
The first step was to port the node.js version, almost literally, to Go. I used ioutil.ReadFile(inputFileName) to get a string version of the file and then returned an encoded string as output.
When I ran this on a file with 2,000,000 redis commands, it took about 8 seconds, compared to about 16 seconds with the node version. I guessed that the reason it was only twice as fast was because it was reading the whole file into memory first, so I changed my encoding function to accept a pair (raw io.Reader, enc io.Writer), and it looks like this:
func EncodeStream(raw io.Reader, enc io.Writer) {
	var args []string
	var length int
	scanner := bufio.NewScanner(raw)
	for scanner.Scan() {
		command := strings.TrimSpace(scanner.Text())
		args = parse(command)
		length = len(args)
		if length > 0 {
			io.WriteString(enc, fmt.Sprintf("*%d\r\n", length))
			for _, arg := range args {
				io.WriteString(enc, fmt.Sprintf("$%d\r\n%s\r\n", len(arg), arg))
			}
		}
	}
}
However, this took 12 seconds on the 2 million line file, so I used github.com/pkg/profile to see how it was using memory, and it looks like the number of memory allocations is huge:
# Alloc = 3162912
# TotalAlloc = 1248612816
# Mallocs = 46001048
# HeapAlloc = 3162912
Can I constrain the io.Writer to use a fixed-size buffer and avoid all those allocations?
More generally, how can I avoid excessive allocations in this method? Here's the full source for more context.
Answer 1
Score: 1
Reduce allocations by working with []byte instead of strings, and fmt.Fprintf directly to the output instead of fmt.Sprintf plus io.WriteString:
func EncodeStream(raw io.Reader, enc io.Writer) {
	var args [][]byte // parse must be adapted to take and return []byte
	scanner := bufio.NewScanner(raw)
	for scanner.Scan() {
		command := bytes.TrimSpace(scanner.Bytes())
		args = parse(command)
		if len(args) > 0 {
			// Fprintf writes straight to enc, avoiding the temporary
			// string that Sprintf allocates on every call.
			fmt.Fprintf(enc, "*%d\r\n", len(args))
			for _, arg := range args {
				fmt.Fprintf(enc, "$%d\r\n%s\r\n", len(arg), arg)
			}
		}
	}
}