Why does writing files with the syscall.O_DIRECT flag make writes slower in Go?
Question
I've got a small piece of code named test.go. It measures the time (in nanoseconds) taken by two writes of the same byte slice, one through a file descriptor opened with the syscall.O_DIRECT flag and the other without it.
The code is below:
package main

import (
	"bytes"
	"fmt"
	"os"
	"strconv"
	"syscall"
	"time"
)

func main() {
	// Number of bytes to write, taken from the first command-line argument.
	num, _ := strconv.Atoi(os.Args[1])
	writeContent := bytes.Repeat([]byte("1"), num)

	// Timed open and write with O_DIRECT (the result of Write is not checked).
	t1 := time.Now().UnixNano()
	fd1, err := syscall.Open("abc.txt", syscall.O_WRONLY|syscall.O_DIRECT|syscall.O_TRUNC, 0)
	if err != nil {
		panic(err)
	}
	syscall.Write(fd1, writeContent)
	t2 := time.Now().UnixNano()
	fmt.Println("sysW1:", t2-t1)

	// Timed open and write without O_DIRECT (the result of Write is not checked).
	t1 = time.Now().UnixNano()
	fd2, err := syscall.Open("abc.txt", syscall.O_WRONLY|syscall.O_TRUNC, 0)
	if err != nil {
		panic(err)
	}
	syscall.Write(fd2, writeContent)
	t2 = time.Now().UnixNano()
	fmt.Println("sysW2:", t2-t1)
}
The program is run from the Linux command line like this (after being compiled with go build ./test.go):
./test 1024
I had expected writing a file with the syscall.O_DIRECT flag to be faster, but the result showed that it was about 30 times slower than writing without the flag.
Result:
sysW1: 1107377
sysW2: 37155
Why? I thought writing with syscall.O_DIRECT does less copying and would be faster, but it turns out to be much slower. Please help me understand this.
PS: I will not provide a Playground link, since for some reason the result is always 0 when the program runs on the Playground.
Answer 1
Score: 1
O_DIRECT doesn't do what you think. While it does less memory copying (since it doesn't copy to the cache before copying to the device driver), that doesn't give you a performance boost.
The filesystem cache lets the system call return early, before the data has been written to the device, and it buffers data so that it can be sent to the device in larger chunks.
With O_DIRECT, the system call waits until the data is completely transferred to the device.
From the man page for the open call:
> O_DIRECT (since Linux 2.4.10)
>
> Try to minimize cache effects of the I/O to and from this
> file. In general this will degrade performance, but it is
> useful in special situations, such as when applications do
> their own caching. File I/O is done directly to/from
> user-space buffers. The O_DIRECT flag on its own makes an
> effort to transfer data synchronously, but does not give
> the guarantees of the O_SYNC flag that data and necessary
> metadata are transferred.
See also: https://stackoverflow.com/questions/41257656/what-does-o-direct-really-mean
You don't need to manually release the cache after using it.
The cache is considered free available memory by the Linux kernel. If a process needs memory that is occupied by the cache, the kernel will flush/release the cache at that point. The cache doesn't "use up" memory.