Understanding Linux write performance
Question
I've been doing some benchmarking to try to understand write performance on Linux, and I don't understand the results I got (I'm using ext4 on Ubuntu 17.04, though I'm more interested, if anything, in understanding ext4 than in comparing filesystems).
Specifically, I understand that some databases/filesystems work by keeping a stale copy of your data and writing updates to a modification log. Periodically, the log is replayed over the stale data to get a fresh version of the data, which is then persisted. This only makes sense to me if appending to a file is faster than overwriting the whole file (otherwise why write updates to a log? Why not just overwrite the data on disk?). I was curious how much faster appending is than overwriting, so I wrote a small benchmark in Go (https://gist.github.com/msteffen/08267045be42eb40900758c419c3bd38) and got these results:
$ go test ./write_test.go -bench='.*'
BenchmarkWrite/Write_10_Bytes_10_times-8 30 46189788 ns/op
BenchmarkWrite/Write_100_Bytes_10_times-8 30 46477540 ns/op
BenchmarkWrite/Write_1000_Bytes_10_times-8 30 46214996 ns/op
BenchmarkWrite/Write_10_Bytes_100_times-8 3 458081572 ns/op
BenchmarkWrite/Write_100_Bytes_100_times-8 3 678916489 ns/op
BenchmarkWrite/Write_1000_Bytes_100_times-8 3 448888734 ns/op
BenchmarkWrite/Write_10_Bytes_1000_times-8 1 4579554906 ns/op
BenchmarkWrite/Write_100_Bytes_1000_times-8 1 4436367852 ns/op
BenchmarkWrite/Write_1000_Bytes_1000_times-8 1 4515641735 ns/op
BenchmarkAppend/Append_10_Bytes_10_times-8 30 43790244 ns/op
BenchmarkAppend/Append_100_Bytes_10_times-8 30 44581063 ns/op
BenchmarkAppend/Append_1000_Bytes_10_times-8 30 46399849 ns/op
BenchmarkAppend/Append_10_Bytes_100_times-8 3 452417883 ns/op
BenchmarkAppend/Append_100_Bytes_100_times-8 3 458258083 ns/op
BenchmarkAppend/Append_1000_Bytes_100_times-8 3 452616573 ns/op
BenchmarkAppend/Append_10_Bytes_1000_times-8 1 4504030390 ns/op
BenchmarkAppend/Append_100_Bytes_1000_times-8 1 4591249445 ns/op
BenchmarkAppend/Append_1000_Bytes_1000_times-8 1 4522205630 ns/op
PASS
ok command-line-arguments 52.681s
This left me with two questions that I couldn't think of an answer to:
1. <s>Why does time per operation go up so much when I go from 100 writes to 1000? (I know Go repeats benchmarks for me, so doing multiple writes myself is probably silly, but since I got a weird answer I'd like to understand why.)</s> This was due to a bug in the Go test (which is now fixed).
2. Why isn't appending to a file faster than writing to it? I thought the whole point of the update log was to take advantage of the comparative speed of appends? (Note that the current benchmark calls `Sync()` after every write, but even if I don't, appends are no faster than overwrites, though both are much faster overall.)
If any of the experts here could enlighten me, I would really appreciate it! Thanks!
Answer 1
Score: 1
About (1), I think the issue is related to your benchmarks not doing what the Go tools expect them to do.
From the documentation (https://golang.org/pkg/testing/#hdr-Benchmarks):
> The benchmark function must run the target code b.N times. During benchmark execution, b.N is adjusted until the benchmark function lasts long enough to be timed reliably.
I don't see your code using `b.N`, so while the benchmark tool thinks you run the code `b.N` times, you are managing the repeats by yourself. Depending on the values the tool is actually using for `b.N`, the results will vary unexpectedly.
You can actually do things 10, 100, and 1,000 times, but in all cases do them `b.N` times (make that `b.N * 10`, `b.N * 100`, etc.) so that the reported benchmark is adjusted properly.
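A minimal sketch of that shape, assuming a `writeNTimes` helper that does the actual file I/O (the helper is invented here; the code in the gist may differ in detail):

```go
package bench

import (
	"os"
	"testing"
)

// writeNTimes overwrites a temp file n times, syncing after each write.
// This helper is illustrative, standing in for the gist's inner loop.
func writeNTimes(b *testing.B, n int) {
	f, err := os.CreateTemp(b.TempDir(), "bench")
	if err != nil {
		b.Fatal(err)
	}
	defer f.Close()
	data := make([]byte, 100)
	for i := 0; i < n; i++ {
		if _, err := f.WriteAt(data, 0); err != nil {
			b.Fatal(err)
		}
		if err := f.Sync(); err != nil {
			b.Fatal(err)
		}
	}
}

// Looping over b.N is what lets the tool scale the iteration count until
// the run is long enough to time reliably; ns/op is then per group of
// 100 writes, and the 10/100/1000 variants stay comparable.
func BenchmarkWrite100Times(b *testing.B) {
	for i := 0; i < b.N; i++ {
		writeNTimes(b, 100)
	}
}
```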
About (2), when some systems prefer a sequential log for storing operations and then replaying them, it's not because appending to a file is faster than overwriting a single file.
In a database system, if you need to update a specific record, you must first find the actual file (and the position within that file) that you need to update.
That might require several index lookups, and once you update the record, you might need to update those indexes to reflect the new values.
So the right comparison is appending to a single log versus making several reads plus several writes.
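A rough sketch of that contrast, using a toy in-memory index to stand in for the on-disk structures (all names here are invented for illustration):

```go
package sketch

import (
	"fmt"
	"os"
)

// index is a toy in-memory map from record key to file offset. A real
// database index lives on disk and may itself cost several reads per lookup.
type index map[string]int64

// updateInPlace models the random-access path: look up where the record
// lives, overwrite it there, then keep the index current.
func updateInPlace(data *os.File, idx index, key string, rec []byte) error {
	off, ok := idx[key] // index lookup; on disk this is extra I/O
	if !ok {
		return fmt.Errorf("no such record: %q", key)
	}
	if _, err := data.WriteAt(rec, off); err != nil {
		return err
	}
	idx[key] = off // on disk, updating the index is yet another write
	return nil
}

// logUpdate models the log path: a single sequential append, no lookups.
// The log file is assumed to be opened with os.O_APPEND.
func logUpdate(log *os.File, key string, rec []byte) error {
	_, err := fmt.Fprintf(log, "%s=%s\n", key, rec)
	return err
}
```

Even in this toy form, the in-place path pays for the lookup and the index maintenance on top of the record write, while the log path is a single sequential append.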