golang os *File.Readdir using lstat on all files. Can it be optimised?




I am writing a program that uses os.File.Readdir to find all subdirectories of a parent directory that contains a huge number of files. Running strace to count the system calls showed that the Go version calls lstat() on every file and directory in the parent directory. (I am testing with /usr/bin for now.)

Go code:

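package main

import (
	"fmt"
	"os"
)

func main() {
	x, err := os.Open("/usr/bin")
	if err != nil {
		panic(err)
	}
	y, err := x.Readdir(0) // returns []os.FileInfo for every entry
	if err != nil {
		panic(err)
	}
	for _, i := range y {
		fmt.Println(i.Name(), i.IsDir()) // one line of output per entry
	}
}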


Strace on the program (without following threads):

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.62    0.004110           2      2466           write
  3.46    0.000152           7        22           getdents64
  2.92    0.000128           0      2466           lstat // grows with the number of files
  0.00    0.000000           0        11           mmap
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0       114           rt_sigaction
  0.00    0.000000           0         8           rt_sigprocmask
  0.00    0.000000           0         1           sched_yield
  0.00    0.000000           0         3           clone
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         2           sigaltstack
  0.00    0.000000           0         1           arch_prctl
  0.00    0.000000           0         1           gettid
  0.00    0.000000           0        57           futex
  0.00    0.000000           0         1           sched_getaffinity
  0.00    0.000000           0         1           openat
------ ----------- ----------- --------- --------- ----------------
100.00    0.004390                  5156           total

I ran the same test with C's readdir() and did not see this behaviour.

C code:

#include <stdio.h>
#include <dirent.h>

int main (void) {
    DIR* dir_p;
    struct dirent* dir_ent;

    dir_p = opendir ("/usr/bin");

    if (dir_p != NULL) {
        // The readdir() function returns a pointer to a dirent structure representing the next
        // directory entry in the directory stream pointed to by dirp.
        // It returns NULL on reaching the end of the directory stream or if an error occurred.
        while ((dir_ent = readdir (dir_p)) != NULL) {
            // printf("%s", dir_ent->d_name);
            // printf("%d", dir_ent->d_type);
            if (dir_ent->d_type == DT_DIR) {
                printf("%s is a directory\n", dir_ent->d_name);
            } else {
                printf("%s is not a directory\n", dir_ent->d_name);
            }
        }

        (void) closedir(dir_p);
    } else {
        perror ("Couldn't open the directory");
    }

    return 0;
}

Strace on the program:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.000128           0      2468           write
  0.00    0.000000           0         1           read
  0.00    0.000000           0         3           open
  0.00    0.000000           0         3           close
  0.00    0.000000           0         4           fstat
  0.00    0.000000           0         8           mmap
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         3         3 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         4           getdents
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.000128                  2503         3 total

I am aware that the only fields in the dirent structure that are mandated by POSIX.1 are d_name and d_ino, but I am writing this for a specific filesystem.

I tried (*File).Readdirnames(), which does not call lstat and returns the names of all files and directories. But checking whether each returned name is a file or a directory ends up doing an lstat per entry again.

  • Is it possible to rewrite the Go program so that it avoids the unnecessary lstat() on every file? The C program makes only the following syscalls to read the directory:
    open("/usr/bin", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
    fstat(3, {st_mode=S_IFDIR|0755, st_size=69632, ...}) = 0
    brk(NULL) = 0x1098000
    brk(0x10c1000) = 0x10c1000
    getdents(3, /* 986 entries */, 32768) = 32752
  • Is this a premature optimisation that I shouldn't worry about? I ask because the directory being monitored will hold a huge number of small archived files, and the Go version makes almost twice as many system calls as the C version (5156 vs 2503 in the traces above), which will hit the disk.


Score: 7


The package dirent looks like it accomplishes what you are looking for. Below is your C example written in Go:

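package main

import (
	"bytes"
	"fmt"
	"io"

	// The third-party dirent package must also be imported here;
	// its import path is not given in this answer.
	"golang.org/x/sys/unix"
)

// int8ToString converts a NUL-terminated C character array to a Go string.
func int8ToString(s []int8) string {
	var buff bytes.Buffer
	for _, chr := range s {
		if chr == 0x00 {
			break
		}
		buff.WriteByte(byte(chr))
	}
	return buff.String()
}

func main() {
	stream, err := dirent.Open("/usr/bin")
	if err != nil {
		panic(err)
	}
	defer stream.Close()
	for {
		entry, err := stream.Read()
		if err != nil {
			if err == io.EOF {
				break // end of the directory stream
			}
			panic(err)
		}

		name := int8ToString(entry.Name[:])
		if entry.Type == unix.DT_DIR {
			fmt.Printf("%s is a directory\n", name)
		} else {
			fmt.Printf("%s is not a directory\n", name)
		}
	}
}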


Score: 2

Starting with Go 1.16 (Feb 2021), a good option is os.ReadDir:

package main

import "os"

func main() {
	files, e := os.ReadDir(".")
	if e != nil {
		panic(e)
	}
	for _, file := range files {
		println(file.Name())
	}
}

os.ReadDir returns fs.DirEntry values instead of fs.FileInfo. A DirEntry exposes only
Name, IsDir, Type and Info; Size and ModTime are available only through Info, so the
per-file lstat is generally skipped unless you ask for that information.


  • Posted on 2017-01-02 05:07:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41419056.html



