英文:
SQL result to JSON as fast as possible
问题
我正在尝试将Go内置的SQL结果转换为JSON。我正在使用goroutines,但遇到了问题。
基本问题:
有一个非常大的数据库,大约有20万个用户,我必须通过TCP套接字在基于微服务的系统中为它们提供服务。从数据库获取用户花费了20毫秒,但使用当前解决方案将这批数据转换为JSON花费了10秒。这就是为什么我想使用goroutines的原因。
使用Goroutines的解决方案:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
cols, _ := rows.Columns()
defer rows.Close()
done := make(chan struct{})
go func() {
defer close(done)
for result := range resultChannel {
results = append(
results,
result,
)
}
}()
wg.Add(1)
go func() {
for rows.Next() {
wg.Add(1)
go handleSQLRow(cols, rows)
}
wg.Done()
}()
go func() {
wg.Wait()
defer close(resultChannel)
}()
<-done
s, err := json.Marshal(results)
results = []resultContainer{}
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
func handleSQLRow(cols []string, rows *sql.Rows) {
defer wg.Done()
result := make(map[string]string, len(cols))
fmt.Println("asd -> " + strconv.Itoa(counter))
counter++
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
rows.Scan(dest...) // GET PANIC
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
fmt.Println(string(raw))
result[cols[i]] = string(raw)
}
}
resultChannel <- result
}
这个解决方案给我一个恐慌,错误信息如下:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x45974c]
goroutine 408 [running]:
panic(0x7ca140, 0xc420010150)
/usr/lib/golang/src/runtime/panic.go:500 +0x1a1
database/sql.convertAssign(0x793960, 0xc420529210, 0x7a5240, 0x0, 0x0, 0x0)
/usr/lib/golang/src/database/sql/convert.go:88 +0x1ef1
database/sql.(*Rows).Scan(0xc4203e4060, 0xc42021fb00, 0x44, 0x44, 0x44, 0x44)
/usr/lib/golang/src/database/sql/sql.go:1850 +0xc2
github.com/PumpkinSeed/zerodb/operations.handleSQLRow(0xc420402000, 0x44, 0x44, 0xc4203e4060)
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:290 +0x19c
created by github.com/PumpkinSeed/zerodb/operations.getJSON.func2
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:258 +0x91
exit status 2
当前的解决方案虽然有效,但花费太多时间:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
var results []resultContainer
cols, _ := rows.Columns()
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
defer rows.Close()
for rows.Next() {
result := make(map[string]string, len(cols))
rows.Scan(dest...)
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
result[cols[i]] = string(raw)
}
}
results = append(results, result)
}
s, err := json.Marshal(results)
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
问题:
为什么goroutine解决方案会给我一个错误,而不是明显的恐慌,因为前面大约200个goroutine都正常运行?
更新:
原始工作解决方案的性能测试:
INFO[0020] setup taken -> 3.149124658s file=operations.go func=operations.getJSON line=260 service="Database manager" ts="2017-04-02 19:45:27.132881211 +0100 BST"
INFO[0025] toJSON taken -> 5.317647046s file=operations.go func=operations.getJSON line=263 service="Database manager" ts="2017-04-02 19:45:32.450551417 +0100 BST"
将SQL映射为JSON花费了3秒,将JSON转换为JSON花费了5秒。
英文:
I'm trying to transform the Go built-in sql result to JSON. I'm using goroutines for that but I got problems.
The base problem:
There is a really big database with around 200k user and I have to serve them through tcp sockets in a microservice based system. To get the users from the database spent 20ms but transform this bunch of data to JSON spend 10 seconds with the current solution. This is why I want to use goroutines.
Solution with Goroutines:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
cols, _ := rows.Columns()
defer rows.Close()
done := make(chan struct{})
go func() {
defer close(done)
for result := range resultChannel {
results = append(
results,
result,
)
}
}()
wg.Add(1)
go func() {
for rows.Next() {
wg.Add(1)
go handleSQLRow(cols, rows)
}
wg.Done()
}()
go func() {
wg.Wait()
defer close(resultChannel)
}()
<-done
s, err := json.Marshal(results)
results = []resultContainer{}
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
func handleSQLRow(cols []string, rows *sql.Rows) {
defer wg.Done()
result := make(map[string]string, len(cols))
fmt.Println("asd -> " + strconv.Itoa(counter))
counter++
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
rows.Scan(dest...) // GET PANIC
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
fmt.Println(string(raw))
result[cols[i]] = string(raw)
}
}
resultChannel <- result
}
This solution give me a panic with the following message:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x45974c]
goroutine 408 [running]:
panic(0x7ca140, 0xc420010150)
/usr/lib/golang/src/runtime/panic.go:500 +0x1a1
database/sql.convertAssign(0x793960, 0xc420529210, 0x7a5240, 0x0, 0x0, 0x0)
/usr/lib/golang/src/database/sql/convert.go:88 +0x1ef1
database/sql.(*Rows).Scan(0xc4203e4060, 0xc42021fb00, 0x44, 0x44, 0x44, 0x44)
/usr/lib/golang/src/database/sql/sql.go:1850 +0xc2
github.com/PumpkinSeed/zerodb/operations.handleSQLRow(0xc420402000, 0x44, 0x44, 0xc4203e4060)
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:290 +0x19c
created by github.com/PumpkinSeed/zerodb/operations.getJSON.func2
/home/loow/gopath/src/github.com/PumpkinSeed/zerodb/operations/operations.go:258 +0x91
exit status 2
The current solution which is working but spend too much time:
func getJSON(rows *sql.Rows, cnf configure.Config) ([]byte, error) {
log := logan.Log{
Cnf: cnf,
}
var results []resultContainer
cols, _ := rows.Columns()
rawResult := make([][]byte, len(cols))
dest := make([]interface{}, len(cols))
for i := range rawResult {
dest[i] = &rawResult[i]
}
defer rows.Close()
for rows.Next() {
result := make(map[string]string, len(cols))
rows.Scan(dest...)
for i, raw := range rawResult {
if raw == nil {
result[cols[i]] = ""
} else {
result[cols[i]] = string(raw)
}
}
results = append(results, result)
}
s, err := json.Marshal(results)
if err != nil {
log.Context(1).Error(err)
}
rows.Close()
return s, nil
}
Question:
Why the goroutine solution give me an error, where it is not an obvious panic, because the first ~200 goroutine running properly?!
UPDATE
Performance test for the original working solution:
INFO[0020] setup taken -> 3.149124658s file=operations.go func=operations.getJSON line=260 service="Database manager" ts="2017-04-02 19:45:27.132881211 +0100 BST"
INFO[0025] toJSON taken -> 5.317647046s file=operations.go func=operations.getJSON line=263 service="Database manager" ts="2017-04-02 19:45:32.450551417 +0100 BST"
The sql to map is 3 sec and to json is 5 sec.
答案1
得分: 0
Go协程在像JSON编组这样的CPU绑定操作上不会提高性能。你需要的是一个更高效的JSON编组器。虽然我没有使用过,但有一些可用的选项。简单地在Google上搜索“更快的JSON编组”将会得到很多结果。一个受欢迎的选项是ffjson。我建议从那里开始。
英文:
Go routines won't improve performance on CPU-bound operations like JSON marshaling. What you need is a more efficient JSON marshaler. There are some available, although I haven't used any. A simple Google search for 'faster JSON marshaling' will turn up many results. A popular one is ffjson. I suggest starting there.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论