How to extract .7z files in Go

Question

I have a 7z archive containing a number of .txt files. I am trying to list all the files in the archive and upload them to an S3 bucket, but I'm having trouble extracting .7z archives in Go. I found the package github.com/gen2brain/go-unarr (imported as extractor), and this is what I have so far:

content, err := ioutil.ReadFile("sample_archive.7z")
if err != nil {
	fmt.Printf("err: %+v", err)
}

a, err := extractor.NewArchiveFromMemory(content)
if err != nil {
	fmt.Printf("err: %+v", err)
}

lst, _ := a.List()
fmt.Printf("lst: %+v", lst)

This prints a list of all the files in the archive. But this has two issues.

It reads the file from local disk using ioutil, and the input of NewArchiveFromMemory must be of type []byte. But I can't read from local disk; I have to use a file that I already hold as an os.File. So I either need a different method or a way to convert the os.File to []byte. There is another method, NewArchiveFromReader(r io.Reader), but it returns an error saying Bad File Descriptor.

file, err := os.OpenFile(
	path,
	os.O_WRONLY|os.O_TRUNC|os.O_CREATE,
	0666,
)

a, err := extractor.NewArchiveFromReader(file)
if err != nil {
	fmt.Printf("ERROR: %+v", err)
}
    
lst, _ := a.List()
fmt.Printf("files: %+v\n", lst)

I am able to get the list of the files in the archive, and using Extract(destination_path string) I can also extract them to a local directory. But I want the extracted files in os.File form as well (i.e. a list of os.File, since there will be multiple files).

How can I change my current code to achieve both of the above goals? Is there another library that can do this?

Answer 1

Score: 2

  1. os.File implements the io.Reader interface (because it has a Read([]byte) (int, error) method defined), so you can use NewArchiveFromReader(file) without any conversions needed. You can read up on Go interfaces for more background on why that works.
  2. If you're okay with extracting to a local directory, you can do that and then read the files back in (warning, may contain typos):
func extractAndOpenAll(a *extractor.Archive) ([]*os.File, error) {
  const dir = "/tmp/path" // consider using ioutil.TempDir()

  // Depending on the go-unarr version, Extract may also return the list of
  // extracted paths; in that case use `_, err := a.Extract(dir)` instead.
  err := a.Extract(dir)
  if err != nil {
    return nil, err
  }

  filestats, err := ioutil.ReadDir(dir)
  if err != nil {
    return nil, err
  }

  // warning: all these file handles must be closed by the caller,
  // which is why even the error case here returns the list of files.
  // if you forget, your process might leak file handles.
  files := make([]*os.File, 0, len(filestats))
  for _, fs := range filestats {
    // ReadDir returns bare file names, so join them with the directory.
    file, err := os.Open(filepath.Join(dir, fs.Name()))
    if err != nil {
      return files, err
    }

    files = append(files, file)
  }

  return files, nil
}

It is possible to use the archived files without writing back to disk (https://github.com/gen2brain/go-unarr#read-all-entries-from-archive), but whether or not you should do that instead depends on what your next step is.
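
A minimal sketch of that in-memory approach follows. It assumes the archive exposes Entry(), Name(), and ReadAll() as in the linked README section, with Entry() returning io.EOF after the last entry; the entryReader interface, readEntries, and fakeArchive are hypothetical names introduced here so the example runs without the library. In real code you would pass the *extractor.Archive instead of the fake.

```go
package main

import (
	"fmt"
	"io"
)

// entryReader lists the methods this sketch relies on. They are assumed to
// match go-unarr's *Archive as used in its README.
type entryReader interface {
	Entry() error             // advance to the next entry; io.EOF when done
	Name() string             // name of the current entry
	ReadAll() ([]byte, error) // full contents of the current entry
}

// readEntries walks the archive and returns each entry's bytes keyed by name,
// without writing anything to disk.
func readEntries(a entryReader) (map[string][]byte, error) {
	out := make(map[string][]byte)
	for {
		if err := a.Entry(); err != nil {
			if err == io.EOF {
				return out, nil
			}
			return nil, err
		}
		data, err := a.ReadAll()
		if err != nil {
			return nil, err
		}
		out[a.Name()] = data
	}
}

// fakeArchive is a hypothetical in-memory stand-in used only for this demo.
type fakeArchive struct {
	names []string
	data  map[string]string
	pos   int
}

func (f *fakeArchive) Entry() error {
	if f.pos >= len(f.names) {
		return io.EOF
	}
	f.pos++
	return nil
}

func (f *fakeArchive) Name() string { return f.names[f.pos-1] }

func (f *fakeArchive) ReadAll() ([]byte, error) {
	return []byte(f.data[f.Name()]), nil
}

func main() {
	a := &fakeArchive{
		names: []string{"a.txt", "b.txt"},
		data:  map[string]string{"a.txt": "hello", "b.txt": "world"},
	}
	entries, err := readEntries(a)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range a.names {
		fmt.Printf("%s: %d bytes\n", name, len(entries[name]))
	}
}
```

Each []byte can be wrapped with bytes.NewReader to get an io.Reader, which is usually what an upload call wants, so an os.File per entry may not be needed at all.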

Posted by huangapple on January 27, 2022, 03:16:31
When reposting, please keep the link to this article: https://go.coder-hub.com/70869050.html