What is measured by changing the `file-block-size` parameter in the sysbench fileio test?

Question

I was trying to measure system performance with the sysbench fileio test. However, I'm not sure what I am actually changing when I vary the file-block-size parameter.

Previously I thought it was the file system block size, but then I looked at the code, and it is actually a wrapper around the file system block size. The pseudocode for how sysbench reads a file in the fileio test is as follows (mainly from sb_fileio.c):

while current_pointer < file_leng:
    read_leng = min(file_block_size, file_leng - current_pointer)
    pread(fd, read_buf, read_leng, current_pointer)
    current_pointer += read_leng
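
The loop above can be tried out directly. The following is a minimal Python sketch, not sysbench's actual code: it assumes a scratch file created with tempfile, ordinary buffered I/O (no O_DIRECT), and the hypothetical helper names read_whole_file and make_scratch_file. It only illustrates that each pread request is capped at min(file_block_size, bytes remaining).

```python
import os
import tempfile

KIB = 1024


def read_whole_file(path, file_block_size):
    """Read a file front to back and return the length of each pread request.

    Hypothetical helper mirroring the pseudocode; not taken from sysbench.
    """
    request_lengths = []
    file_length = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        current_pointer = 0
        while current_pointer < file_length:
            # Each request is capped by the bytes left in the file.
            read_length = min(file_block_size, file_length - current_pointer)
            data = os.pread(fd, read_length, current_pointer)
            if not data:  # unexpected end of file
                break
            request_lengths.append(len(data))
            current_pointer += len(data)
    finally:
        os.close(fd)
    return request_lengths


def make_scratch_file(size_bytes):
    """Create a temporary file of the given size and return its path."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"\0" * size_bytes)
        return handle.name
```

Under these assumptions, a 1024K file read with a 256K block size produces four 256K requests, while a 16K file read with the same block size produces a single 16K request.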

Sysbench uses pread here, a syscall implemented by the file system. When the file size is smaller than file_block_size, the parameter has no effect, because the read size is capped at the file size and is therefore always smaller than the file_block_size we gave it. The actual block size used inside pread (i.e. how many bytes have to be loaded from disk into memory even if we only want to read 1 byte) is already defined by the file system (if not by the hardware).

For example, suppose the file system block size used by pread is 4K. When the sysbench file_block_size is 1K/2K/4K, each pread syscall fetches a 4K/4K/4K block. When file_block_size = 1024K and file_size = 1024K, each pread syscall fetches 256 4K blocks (instead of one 1024K block). But when file_block_size = 1024K and file_size = 16K, the read length passed to pread is always just 16K, so instead of retrieving 1024K (= 256 * 4K) it retrieves four 4K blocks, because the length is min(file_size, file_block_size).
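
The block arithmetic in this example can be checked with a few lines of Python. This is a sketch under the question's own assumption of a 4K file system block. The function name fs_blocks_spanned is made up for illustration, and it counts only which blocks a byte range overlaps, not what the kernel or the device actually transfers.

```python
KIB = 1024
FS_BLOCK_SIZE = 4 * KIB  # assumed 4K file system block, as in the question


def fs_blocks_spanned(offset, length, fs_block_size=FS_BLOCK_SIZE):
    """Count the file system blocks overlapped by bytes [offset, offset + length)."""
    if length <= 0:
        return 0
    first_block = offset // fs_block_size
    last_block = (offset + length - 1) // fs_block_size
    return last_block - first_block + 1


# The three cases from the example, all starting at offset 0.
small_requests = [fs_blocks_spanned(0, n * KIB) for n in (1, 2, 4)]
large_file = fs_blocks_spanned(0, min(1024 * KIB, 1024 * KIB))
small_file = fs_blocks_spanned(0, min(1024 * KIB, 16 * KIB))
```

Note that a request that is not aligned to a block boundary can overlap one more block than its length suggests, which is one reason the offset is a parameter here.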

Is my understanding right? If so, what am I actually changing when I change that parameter? Or am I supposed to always set file_size to be bigger than file_block_size?

Also, when loading 1024K, sysbench is actually loading 256 4K blocks inside the pread syscall rather than 1024K as a whole. Should there be any performance (throughput/latency) difference between these two behaviors?

=====

The command I used:

./sysbench --file-block-size=<file_block_size> --file-total-size=65536K --file_num=<file_num> --file-test-mode=rndrd --file-fsync-all=on --file-extra-flags=direct fileio <prepare/run/cleanup>

file_block_size takes values in {1K, 4K, 16K, 256K, 1024K, etc.} and file_num takes values in {1, 4, 16, ..., 65536}, so the size of a single file is in {65536K, 16384K, ..., 1K}. The results I got:

[Figure: latency (us) over file size (K) with different file_block_sizes]

Here, 16K files with a 256K file_block_size have much lower latency than 256K files with a 256K file_block_size. That should not be the case if file_block_size were the hardware's load unit size, so it is not the file system block size (I have an ext2/ext3 file system with a 4K block size). Then what is it?

Answer 1

Score: 0

A few different observations.

First, it appears that the file block size is actually the size of the buffer used for each I/O operation. In other words, it's the size of the buffer that each read (or write) fills.

So if you have a 4k file system block size, a 256k file block size, and a 1024k file size, you are issuing 4 separate 256k reads. Each of those reads, under the hood, reads a range of 4k file system blocks (often called an extent). Reading larger extents is usually more efficient than reading small amounts, because the lower-level schedulers can optimize things. You also generate fewer requests and subsequent system calls, so there is less to queue up, and so on.

Second, it looks like there are some guardrails that silently kick in when the block size is larger than the file size. Presumably the utility simply reads the entire file in one go. This would explain why a 16k file with a 256k block size is faster than a 256k file: you're reading 16 times less data.
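
The size difference behind that comparison is simple to compute. A minimal sketch, assuming each request is capped at the file size as in the question's pseudocode; bytes_per_request is a hypothetical name, not a sysbench function.

```python
KIB = 1024


def bytes_per_request(file_block_size, file_size):
    """Bytes one read request transfers when it is capped at the file size."""
    return min(file_block_size, file_size)


small = bytes_per_request(256 * KIB, 16 * KIB)   # 16K file, 256K block size
large = bytes_per_request(256 * KIB, 256 * KIB)  # 256K file, 256K block size
ratio = large // small
```

Here ratio is 16, so under this assumption each request against a 256K file moves 16 times as much data as a request against a 16K file.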

And to wrap up and hopefully answer your original question about the file size vs block parameter, once the file size is <= the block size, you're only going to see a single read request. This isn't inherently wrong, but one should be aware of it depending on what they're trying to measure.

huangapple
  • Posted on June 1, 2023 at 09:59:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76378235.html