April 1, 2014, 06:26:25 · go

Improving the performance of rows.Scan() in Go

Question

I have a very simple query that returns a couple thousand rows with only two columns:

SELECT "id", "value" FROM "table" LIMIT 10000;

After issuing sql.Query(), I traverse the result set with the following code:

data := map[uint8]string{}

for rows.Next() {
	var (
		id     uint8
		value  string
	)

	if error := rows.Scan(&id, &value); error == nil {
		data[id] = value
	}
}

If I run the exact same query directly on the database, I get all results back within a couple of milliseconds, but the Go code takes far longer to complete, sometimes almost 10 seconds!

I started commenting out several parts of the code and it seems that rows.Scan() is the culprit.

> Scan copies the columns in the current row into the values pointed at
> by dest.
>
> If an argument has type *[]byte, Scan saves in that argument a copy of
> the corresponding data. The copy is owned by the caller and can be
> modified and held indefinitely. The copy can be avoided by using an
> argument of type *RawBytes instead; see the documentation for RawBytes
> for restrictions on its use. If an argument has type *interface{},
> Scan copies the value provided by the underlying driver without
> conversion. If the value is of type []byte, a copy is made and the
> caller owns the result.

Can I expect any speed improvement if I use *[]byte, *RawBytes or *interface{} instead?

Looking at the code, it looks like the convertAssign() function is doing a lot of stuff that isn't necessary for this particular query. So my question is: how can I make the Scan process faster?

I thought about overloading the function to expect predetermined types, but that isn't possible in Go...

Any ideas?

Answer 1

Score: 4

Yes, you can use RawBytes instead, and rows.Scan() will avoid the memory allocation and copy.
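
As a minimal sketch of scanning into sql.RawBytes (not taken from the original poster's code): so that it runs without a real database, it registers a tiny in-memory driver. The names fakeDriver, fakeConn, fakeStmt, fakeRows, openFake and loadRows are all hypothetical, and the fake rows are invented test data. It also assumes the driver hands columns over as []byte, which is the case where Scan can skip the copy.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"io"
	"strconv"
)

// Hypothetical in-memory driver, present only so the sketch runs without a
// real database. Every query returns the same three rows.
type fakeDriver struct{}
type fakeConn struct{}
type fakeStmt struct{}
type fakeRows struct{ next, total int }

func (fakeDriver) Open(name string) (driver.Conn, error) { return fakeConn{}, nil }

func (fakeConn) Prepare(query string) (driver.Stmt, error) { return fakeStmt{}, nil }
func (fakeConn) Close() error                              { return nil }
func (fakeConn) Begin() (driver.Tx, error)                 { return nil, errors.New("not supported") }

func (fakeStmt) Close() error  { return nil }
func (fakeStmt) NumInput() int { return 0 }
func (fakeStmt) Exec(args []driver.Value) (driver.Result, error) {
	return nil, errors.New("not supported")
}
func (fakeStmt) Query(args []driver.Value) (driver.Rows, error) {
	return &fakeRows{total: 3}, nil
}

func (r *fakeRows) Columns() []string { return []string{"id", "value"} }
func (r *fakeRows) Close() error      { return nil }
func (r *fakeRows) Next(dest []driver.Value) error {
	if r.next >= r.total {
		return io.EOF
	}
	r.next++
	// Assumption: like many text-protocol drivers, hand columns over as []byte.
	dest[0] = []byte(strconv.Itoa(r.next))
	dest[1] = []byte("value-" + strconv.Itoa(r.next))
	return nil
}

func init() { sql.Register("fake", fakeDriver{}) }

// openFake opens a handle backed by the in-memory driver above.
func openFake() (*sql.DB, error) { return sql.Open("fake", "") }

// loadRows scans every row into sql.RawBytes values and reports Scan errors
// instead of silently skipping rows.
func loadRows(db *sql.DB, query string) (map[int64]string, error) {
	rows, err := db.Query(query)
	if err != nil {
		return nil, err
	}
	defer rows.Close()

	data := make(map[int64]string)
	// RawBytes contents are only valid until the next Next, Scan, or Close.
	var id, value sql.RawBytes
	for rows.Next() {
		if err := rows.Scan(&id, &value); err != nil {
			return nil, err
		}
		n, err := strconv.ParseInt(string(id), 10, 64)
		if err != nil {
			return nil, err
		}
		// string(value) copies the bytes so the map entry outlives the buffer.
		data[n] = string(value)
	}
	return data, rows.Err()
}

func main() {
	db, err := openFake()
	if err != nil {
		panic(err)
	}
	defer db.Close()

	data, err := loadRows(db, `SELECT "id", "value" FROM "table" LIMIT 10000`)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(data), data[1]) // prints "3 value-1" with the fake rows
}
```

Keeping the values in a map forces a copy out of the RawBytes buffer anyway, so the saving is likely to matter most when each row is consumed immediately, for example written straight to a file.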

As for the convertAssign() function: yes, it is not optimal in Go 1.2,
but significant improvements were made in 1.3:

http://code.google.com/p/go/issues/detail?id=7086
Lock-less implementation for sync.Pool

I have an example of RawBytes usage: https://gist.github.com/yvasiyarov/9911956

This code reads data from a MySQL table, does some processing, and writes it to CSV files.
Last night it took 1 minute 24 seconds to generate 4 GB of CSV data (about 30 million rows).

So I'm pretty sure the problem is outside the Go code: even the worst possible usage of rows.Scan() cannot cause a 10-second delay.
