Is binary.Read slow?
Question
I'm rewriting an old small C project into Go (to learn Go).
The project basically reads some binary data from a file, does some filtering on said data, then prints it into stdout.
The main part of the code looks like this (omitting error handling):
    type netFlowRow struct {
        Timestamp uint32
        Srcip     [4]byte
        Dstip     [4]byte
        Proto     uint16
        Srcport   uint16
        Dstport   uint16
        Pkt       uint32
        Size      uint64
    }

    func main() {
        // ...
        file, _ := os.Open(path)
        for j := 0; j < infoRow.Count; j++ {
            netRow := netFlowRow{}
            binary.Read(file, binary.BigEndian, &netRow)
            // ...
            fmt.Printf("%v", netRow)
        }
    }
After a naive rewrite, the Go version ran about 10 times slower than the C version (~40s vs 2-3s). I profiled it with pprof and got this:
    (pprof) top10
    39.96s of 40.39s total (98.94%)
    Dropped 71 nodes (cum <= 0.20s)
    Showing top 10 nodes out of 11 (cum >= 39.87s)
          flat  flat%   sum%        cum   cum%
        39.87s 98.71% 98.71%     39.87s 98.71%  syscall.Syscall
         0.09s  0.22% 98.94%     40.03s 99.11%  encoding/binary.Read
             0     0% 98.94%     39.87s 98.71%  io.ReadAtLeast
             0     0% 98.94%     39.87s 98.71%  io.ReadFull
             0     0% 98.94%     40.03s 99.11%  main.main
             0     0% 98.94%     39.87s 98.71%  os.(*File).Read
             0     0% 98.94%     39.87s 98.71%  os.(*File).read
             0     0% 98.94%     40.21s 99.55%  runtime.goexit
             0     0% 98.94%     40.03s 99.11%  runtime.main
             0     0% 98.94%     39.87s 98.71%  syscall.Read
Am I reading this right? Is syscall.Syscall the main time consumer? Is that where the file is actually being read?
Update:
I switched to bufio.Reader and got this profile:
    (pprof) top10
    34.16s of 36s total (94.89%)
    Dropped 99 nodes (cum <= 0.18s)
    Showing top 10 nodes out of 33 (cum >= 0.56s)
          flat  flat%   sum%        cum   cum%
        31.99s 88.86% 88.86%        32s 88.89%  syscall.Syscall
         0.43s  1.19% 90.06%      0.64s  1.78%  runtime.mallocgc
         0.39s  1.08% 91.14%      1.06s  2.94%  encoding/binary.(*decoder).value
         0.28s  0.78% 91.92%      0.99s  2.75%  reflect.(*structType).Field
         0.28s  0.78% 92.69%      0.28s  0.78%  runtime.duffcopy
         0.24s  0.67% 93.36%      1.64s  4.56%  encoding/binary.sizeof
         0.22s  0.61% 93.97%     34.51s 95.86%  encoding/binary.Read
         0.22s  0.61% 94.58%      0.22s  0.61%  runtime.mach_semaphore_signal
         0.07s  0.19% 94.78%      1.28s  3.56%  reflect.(*rtype).Field
         0.04s  0.11% 94.89%      0.56s  1.56%  runtime.newobject
Answer 1
Score: 5
binary.Read will be slower because it uses reflection. I would suggest benchmarking with a bufio.Reader and manually invoking the binary.BigEndian methods to read your struct:
    type netFlowRow struct {
        Timestamp uint32  // 0
        Srcip     [4]byte // 4
        Dstip     [4]byte // 8
        Proto     uint16  // 12
        Srcport   uint16  // 14
        Dstport   uint16  // 16
        Pkt       uint32  // 18
        Size      uint64  // 22
    }

    func main() {
        // ...
        file, _ := os.Open(path)
        r := bufio.NewReader(file)
        for j := 0; j < infoRow.Count; j++ {
            var buff [4 + 4 + 4 + 2 + 2 + 2 + 4 + 8]byte
            if _, err := io.ReadFull(r, buff[:]); err != nil {
                panic(err)
            }
            netRow := netFlowRow{
                Timestamp: binary.BigEndian.Uint32(buff[:4]),
                // Srcip
                // Dstip
                Proto:   binary.BigEndian.Uint16(buff[12:14]),
                Srcport: binary.BigEndian.Uint16(buff[14:16]),
                Dstport: binary.BigEndian.Uint16(buff[16:18]),
                Pkt:     binary.BigEndian.Uint32(buff[18:22]),
                Size:    binary.BigEndian.Uint64(buff[22:30]),
            }
            copy(netRow.Srcip[:], buff[4:8])
            copy(netRow.Dstip[:], buff[8:12])
            // ...
            fmt.Printf("%v", netRow)
        }
    }