How to get the total size of a directory?

Question

I am trying to get the total size of a directory using Go. So far I have this:

var dirSize int64 = 0

func readSize(path string, file os.FileInfo, err error) error {
    if !file.IsDir() {
        dirSize += file.Size()
    }
    return nil
} 

func DirSizeMB(path string) float64 {
    dirSize = 0
    filepath.Walk(path, readSize)
    sizeMB := float64(dirSize) / 1024.0 / 1024.0
    sizeMB = Round(sizeMB, .5, 2)
    return sizeMB
}

Will the dirSize global variable cause problems and, if so, how do I move it into the scope of the DirSizeMB function?

Answer 1

Score: 43

Using a global like that is bad practice at best.
It is also a data race if DirSizeMB is called concurrently.

The simple solution is to use a closure, e.g.:

func DirSize(path string) (int64, error) {
	var size int64
	err := filepath.Walk(path, func(_ string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if !info.IsDir() {
			size += info.Size()
		}
		return nil
	})
	return size, err
}


You could assign the closure to a variable if you think that looks better.

Answer 2

Score: 0

If you want to use a local variable, you can declare it inside DirSizeMB and capture it in a closure:

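func DirSizeMB(path string) float64 {
	var dirSize int64

	readSize := func(path string, file os.FileInfo, err error) error {
		// file may be nil when err is non-nil, so check err first.
		if err != nil {
			return err
		}
		if !file.IsDir() {
			dirSize += file.Size()
		}

		return nil
	}

	// The error returned by Walk is discarded; on failure dirSize holds a partial total.
	filepath.Walk(path, readSize)

	sizeMB := float64(dirSize) / 1024.0 / 1024.0

	return sizeMB
}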

Answer 3

Score: -1

One thing you could do is define a channel inside DirSizeMB, and define readSize inside that function as well so that it captures the channel in a closure. Then send each file size over the channel and sum the sizes as you receive them.

func DirSizeMB(path string) float64 {
	sizes := make(chan int64)
	readSize := func(path string, file os.FileInfo, err error) error {
		if err != nil || file == nil {
			return nil // Ignore errors
		}
		if !file.IsDir() {
			sizes <- file.Size()
		}
		return nil
	}

	go func() {
		filepath.Walk(path, readSize)
		close(sizes)
	}()

	size := int64(0)
	for s := range sizes {
		size += s
	}

	sizeMB := float64(size) / 1024.0 / 1024.0

	sizeMB = Round(sizeMB, 0.5, 2)

	return sizeMB
}

http://play.golang.org/p/zzKZu0cm9n

Why use a channel?

Unless you've read the underlying code, you don't actually know how filepath.Walk invokes your readSize function. While it probably calls it sequentially over all of the files on the given path, the implementation could theoretically invoke several of these calls simultaneously on separate goroutines (the docs would probably mention this if it did). In any case, in a language designed for concurrency, it's good practice to make sure that your code is safe.

The answer that @DaveC gives shows how closing over a local variable solves the problem of the global variable, so multiple simultaneous calls to DirSize are safe. The docs for Walk explicitly state that the walk function visits files in a deterministic order, so his solution is sufficient for this problem, but I'll leave this as an example of how to make it safe to run the inner function concurrently.

huangapple
  • Posted on September 9, 2015 at 22:43:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/32482673.html