How can I move files in a folder into subfolders using channels or goroutines?

Question

I have a folder that contains several types of files (with no subfolders in this simple case). Let's assume it contains 20000 .raw files and 20000 .jpg files. I need to move the .raw files into a raw folder and the .jpg files into a jpg folder. So I tried to solve it with Go:

package main

import (
	"flag"
	"fmt"
	"io/fs"
	"io/ioutil"
	"os"
	"runtime"
	"strings"
	"sync"
	"time"
)

func CreateFolder(basePath string, folderName string) {
	os.Mkdir(basePath+"/"+folderName, 0755)
}

func MoveFile(file string, path string, folder string) {
	err := os.Rename(path+"/"+file, path+"/"+folder+"/"+file)
	if err != nil {
		panic(err)
	}
}

func getInfo(a fs.FileInfo, c chan string) {
	if a.IsDir() || strings.HasPrefix(a.Name(), ".") {
		return
	} else {
		c <- a.Name()
	}
}

func dealInfo(path string, typeDict *sync.Map, c chan string) {
	for name := range c {
		sp := strings.Split(name, ".")
		suffix := sp[len(sp)-1]

		if _, ok := typeDict.Load(suffix); ok {
			MoveFile(name, path, suffix)
		} else {
			CreateFolder(path, suffix)
			MoveFile(name, path, suffix)
			typeDict.Store(suffix, 1)
		}
	}
}

func main() {
	runtime.GOMAXPROCS(8)
	var (
		filepath = flag.String("p", "", "default self folder")
	)

	flag.Parse()
	fmt.Println(*filepath)
	fmt.Println("==========")
	if *filepath == "" {
		fmt.Println("No valid folder path")
		return
	} else {
		fileinfos, err := ioutil.ReadDir(*filepath)
		stime := time.Now()
		if err != nil {
			panic(err)
		}
		var typeDict sync.Map
		ch := make(chan string, 20)

		for _, fs := range fileinfos {
			go getInfo(fs, ch)
			go dealInfo(*filepath, &typeDict, ch)
		}
		fmt.Println(time.Since(stime))
	}
}

But it fails with an error: runtime: failed to create new OS thread. I guess this is because the program creates too many goroutines? But I have no idea why that happens, because I thought ch := make(chan string, 20) would limit the number of goroutines.

I also tried to use wg *sync.WaitGroup, like:


getInfo(...) // use this func to put all files info into a channel

wg.Add(20)

for i := 0; i < 20; i++ {
    go dealInfo(..., &wg)  // this new dealInfo contains wg.Done()
}

wg.Wait()

But this causes a deadlock error.

May I know the best way to move the files in parallel, please? Your help is really appreciated!

Answer 1

Score: 1

This may work.

However, how a move behaves depends on the operating system and the filesystem.

Doing it in parallel may not be optimal over NFS, for instance, so measure it on your own setup.

The strategy I would try in this situation is to list the files once and send their names over a channel to a few goroutines that perform the move (rename).

The number of goroutines (workers) can be a command-line parameter.

huangapple
  • Posted on May 6, 2022, 01:24:21
  • When reposting, please keep the link to this article: https://go.coder-hub.com/72131295.html