Why is this Go code the same speed as Python (and not much faster)?

Question

I need to calculate SHA-256 checksums for files over 1GB (reading the file in chunks). Currently I am using Python with this:

import hashlib
import time

start_time = time.time()

def sha256sum(filename="big.txt", block_size=2 ** 13):
    sha = hashlib.sha256()
    with open(filename, 'rb') as f:
        for chunk in iter(lambda: f.read(block_size), b''):
            sha.update(chunk)
    return sha.hexdigest()

input_file = '/tmp/1GB.raw'
print 'checksum is: %s\n' % sha256sum(input_file)
print 'Elapsed time: %s' % str(time.time() - start_time)

I wanted to give golang a try, thinking I could get faster results, but after trying the following code it runs a couple of seconds slower:

package main

import (
    "crypto/sha256"
    "fmt"
    "io"
    "math"
    "os"
    "time"
)   

const fileChunk = 8192

func File(file string) string {
    fh, err := os.Open(file)

    if err != nil {
        panic(err.Error())
    }   

    defer fh.Close()

    stat, _ := fh.Stat()
    size := stat.Size()
    chunks := uint64(math.Ceil(float64(size) / float64(fileChunk)))
    h := sha256.New()

    for i := uint64(0); i < chunks; i++ {
        csize := int(math.Min(fileChunk, float64(size-int64(i*fileChunk))))
        buf := make([]byte, csize)
        fh.Read(buf)
        io.WriteString(h, string(buf))
    }   

    return fmt.Sprintf("%x", h.Sum(nil))
}   

func main() {
    start := time.Now()
    fmt.Printf("checksum is: %s\n", File("/tmp/1G.raw"))
    elapsed := time.Since(start)
    fmt.Printf("Elapsed time: %s\n", elapsed)
}

Any ideas on how to improve the golang code, if possible? Maybe use all of the computer's CPU cores, one for reading and another for hashing, any ideas?

Update

As suggested, I am using this code:

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
    "time"
)

func main() {
    start := time.Now()
    fh, err := os.Open("/tmp/1GB.raw")
    if err != nil {
        panic(err.Error())
    }
    defer fh.Close()

    h := sha256.New()
    _, err = io.Copy(h, fh)
    if err != nil {
        panic(err.Error())
    }
    fmt.Println(hex.EncodeToString(h.Sum(nil)))

    fmt.Printf("Elapsed time: %s\n", time.Since(start))
}

For testing, I created the 1GB file with this command:

# mkfile 1G /tmp/1GB.raw
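(mkfile is OS X-specific; on Linux, an equivalent zero-filled test file could be created with dd. The exact flags below are my assumption of a typical invocation, not from the original post:)

# dd if=/dev/zero of=/tmp/1GB.raw bs=1M count=1024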

The new version is faster, but not by much. What about using channels? Could using more than one CPU/core help? I was expecting an improvement of at least 20%, but unfortunately I am getting almost no gain, almost nothing.

Time result for Python:

5.867u 0.250s 0:06.15 99.3% 0+0k 0+0io 0pf+0w

Time result for Go after compiling (go build) and executing the binary:

5.687u 0.198s 0:05.93 98.9% 0+0k 0+0io 0pf+0w

Any more ideas?

Test results

Using the version with channels posted by @icza in the accepted answer below:

Elapsed time: 5.894779733s

Using the version without channels:

Elapsed time: 5.823489239s

I thought that using channels would help a little, but it seems not to.

I am running this on a MacBook Pro with OS X Yosemite, using this go version:

go version go1.4.1 darwin/amd64

Update 2

Setting runtime.GOMAXPROCS to 4:

runtime.GOMAXPROCS(4)

Made things faster:

Elapsed time: 5.741511748s

Update 3

Changing the chunk size to 8192 (like in the Python version) gives the expected result:

...
for b, hasMore := make([]byte, 8192<<10), true; hasMore; {
...

Also using only runtime.GOMAXPROCS(2).

Answer 1

Score: 15

Your solution is quite inefficient: you make a new buffer in each iteration, use it once, and just throw it away.

Also, you convert the content of your buffer (buf) to a string and write that string to the SHA-256 calculator, which converts it back to bytes: an absolutely unnecessary round trip.
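(For illustration, a direct write that also handles short reads and errors, both of which the original loop ignored, could look like this. The snippet is an editorial addition, not part of the original answer; fh and h are the question's variables, with buf allocated once before the loop:)

for {
    n, err := fh.Read(buf)
    if n > 0 {
        h.Write(buf[:n]) // feed the bytes straight to the hash, no string round trip
    }
    if err == io.EOF {
        break
    }
    if err != nil {
        panic(err)
    }
}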

Here is another, quite fast solution; test this for performance:

fh, err := os.Open(file)
if err != nil {
    panic(err.Error())
}   
defer fh.Close()

h := sha256.New()
_, err = io.Copy(h, fh)
if err != nil {
    panic(err.Error())
}   

fmt.Println(hex.EncodeToString(h.Sum(nil)))

A little explanation:

io.Copy() is a function that reads all the data from a Reader (until EOF is reached) and writes it to the specified Writer. Since the SHA-256 calculator (hash.Hash) implements Writer and the file (File, or rather *File) implements Reader, this is as easy as it can be.

Once all the data has been written to the hash, hex.EncodeToString() simply converts the result (obtained with hash.Sum(nil)) into a human-readable hex string.
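(A side note of mine, not part of the original answer: if you want io.Copy semantics but control over the buffer size, Go 1.5 added io.CopyBuffer, which lets you supply your own buffer instead of the internal default. A minimal sketch, assuming Go 1.5+ and the same file path; the 8 MiB size mirrors the asker's Update 3:)

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)

func main() {
    f, err := os.Open("/tmp/1GB.raw")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    h := sha256.New()
    buf := make([]byte, 8<<20) // one reusable 8 MiB buffer for the whole copy
    if _, err := io.CopyBuffer(h, f, buf); err != nil {
        panic(err)
    }
    fmt.Println(hex.EncodeToString(h.Sum(nil)))
}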

Final Verdict

The program reads 1GB of data from the hard disk and does some calculation with it (computes its SHA-256 hash). Since reading from the hard disk is a relatively slow operation, the performance gain of the Go version will not be significant compared to the Python solution. The overall run takes a couple of seconds, which is in the same order of magnitude as the time required to read 1GB of data from the hard disk. Since both the Go and the Python solution need roughly the same amount of time to read the data from disk, you won't see very different results.
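(One way to verify that claim on your own machine, my suggestion rather than the answer's, is to time the raw read alone by copying the file to ioutil.Discard and comparing it to the hashing run:)

package main

import (
    "fmt"
    "io"
    "io/ioutil"
    "os"
    "time"
)

func main() {
    start := time.Now()
    f, err := os.Open("/tmp/1GB.raw")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Read the whole file but do nothing with the bytes.
    n, err := io.Copy(ioutil.Discard, f)
    if err != nil {
        panic(err)
    }
    fmt.Printf("read %d bytes in %s (no hashing)\n", n, time.Since(start))
}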

Possibility of Performance Improvements with Multiple Goroutines

There is a slight margin where you can improve performance: read a chunk of the file into one buffer, start calculating its SHA-256 hash, and at the same time read the next chunk of the file. Once that is done, send it to the SHA-256 calculator and, in parallel, read the next chunk into the first buffer.

But since reading the data from disk takes longer than calculating its SHA-256 digest (or updating the state of the digest calculator), you won't see significant improvement. The performance bottleneck in your case will always be the time required to read the data into memory.

Here is a complete, runnable solution using 2 goroutines: while one goroutine reads a chunk of the file, the other hashes the previously read chunk; when a goroutine finishes reading, it continues with hashing while allowing the other to read in parallel.

Proper synchronization between the phases (reading, hashing) is done with channels. As suspected, the performance gain is just a little over 4% in time (it may vary with CPU and hard disk speed), because the hash computation is negligible compared to the disk reading time. The gain will most likely be higher if the hard disk's read speed is greater (test it on an SSD).

So the complete program:

package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"os"
	"runtime"
	"time"
)

const file = "t:/1GB.raw"

func main() {
	runtime.GOMAXPROCS(2) // Important: Go 1.4 uses only 1 by default!

	start := time.Now()

	f, err := os.Open(file)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	h := sha256.New()

	// 2 channels: used to give the green light for reading into buffer b1 or b2
	readch1, readch2 := make(chan int, 1), make(chan int, 1)

	// 2 channels: used to give the green light for hashing the content of b1 or b2
	hashch1, hashch2 := make(chan int, 1), make(chan int, 1)

	// Start signal: allow b1 to be read and hashed
	readch1 <- 1
	hashch1 <- 1

	go hashHelper(f, h, readch1, readch2, hashch1, hashch2)

	hashHelper(f, h, readch2, readch1, hashch2, hashch1)

	fmt.Println(hex.EncodeToString(h.Sum(nil)))

	fmt.Printf("经过时间:%s\n", time.Since(start))
}

func hashHelper(f *os.File, h hash.Hash, mayRead <-chan int, readDone chan<- int, mayHash <-chan int, hashDone chan<- int) {
	for b, hasMore := make([]byte, 64<<10), true; hasMore; {
		<-mayRead
		n, err := f.Read(b)
		if err != nil {
			if err == io.EOF {
				hasMore = false
			} else {
				panic(err)
			}
		}
		readDone <- 1

		<-mayHash
		_, err = h.Write(b[:n])
		if err != nil {
			panic(err)
		}
		hashDone <- 1
	}
}

Notes:

In my solution I only used 2 goroutines. There is no point in using more because, as noted before, the disk reading speed is the bottleneck and it is already used at its maximum: 2 goroutines ensure that a read can be performed at any time.

Notes on synchronization: the 2 goroutines run in parallel. Each goroutine may use its local buffer b at any time. Access to the shared File and to the shared Hash is synchronized by the channels: only 1 goroutine is allowed to use the Hash at any given time, and only 1 goroutine is allowed to use (read from) the File at any given time.
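(As an aside, a sketch of mine and not part of the original answer: the same read/hash overlap can also be written as a small producer/consumer pipeline, where a reader goroutine takes buffers from a free list, fills them, and sends them in read order to the hashing loop:)

package main

import (
    "crypto/sha256"
    "encoding/hex"
    "fmt"
    "io"
    "os"
)

type chunk struct {
    buf []byte
    n   int
}

func main() {
    f, err := os.Open("/tmp/1GB.raw") // path assumed from the question
    if err != nil {
        panic(err)
    }
    defer f.Close()

    free := make(chan []byte, 2) // recycled buffers
    filled := make(chan chunk)   // chunks ready to hash, in read order
    free <- make([]byte, 64<<10)
    free <- make([]byte, 64<<10)

    // Reader goroutine: take a free buffer, fill it, pass it on.
    go func() {
        defer close(filled)
        for buf := range free {
            n, err := f.Read(buf)
            if n > 0 {
                filled <- chunk{buf, n}
            } else if err == nil {
                free <- buf // nothing read and no error: recycle and retry
            }
            if err == io.EOF {
                return
            }
            if err != nil {
                panic(err)
            }
        }
    }()

    h := sha256.New()
    for c := range filled {
        h.Write(c.buf[:c.n])
        free <- c.buf // hand the buffer back to the reader
    }
    fmt.Println(hex.EncodeToString(h.Sum(nil)))
}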

Answer 2

Score: 0

For anyone who didn't know about it, I think this will help:

https://blog.golang.org/pipelines

At the end of that page there is a solution that MD5-sums files using goroutines.

I tried it out on my own "~" directory: with goroutines it took 1.7 seconds, without goroutines 2.8 seconds.

Here is how the time is spent without goroutines. I don't know how to break down the time when using goroutines, because all of these things run at the same time.

time use  2.805522165s
time read file  759.476091ms
time md5   1.710393575s
time sort  17.355134ms
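(For reference, here is a condensed sketch of the pattern from the end of that post: a walker goroutine feeds paths into a channel and a fixed pool of digester goroutines hashes them. This is my own compressed version with error handling trimmed, not the blog's exact MD5All code:)

package main

import (
    "crypto/md5"
    "fmt"
    "io/ioutil"
    "os"
    "path/filepath"
    "sync"
)

func main() {
    root := os.Args[1]

    // Walker: emit every regular file path, then close the channel.
    paths := make(chan string)
    go func() {
        defer close(paths)
        filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
            if err == nil && info.Mode().IsRegular() {
                paths <- p
            }
            return nil
        })
    }()

    type result struct {
        path string
        sum  [md5.Size]byte
    }
    results := make(chan result)

    // A fixed pool of digesters reads paths and hashes the files
    // (whole files in memory, as the blog example does).
    var wg sync.WaitGroup
    for i := 0; i < 4; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            for p := range paths {
                data, err := ioutil.ReadFile(p)
                if err != nil {
                    continue // skip unreadable files
                }
                results <- result{p, md5.Sum(data)}
            }
        }()
    }
    go func() {
        wg.Wait()
        close(results)
    }()

    for r := range results {
        fmt.Printf("%x  %s\n", r.sum, r.path)
    }
}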