Golang: append file to an existing tar archive

Question

How would I append a file to an existing tar archive in Go? I don't see anything obvious in the docs on how to do it.

I have a tar file that has already been created and I want to add more to it after it has already been closed.

EDIT

Altering the example in the docs and following the answer given, I'm still not getting the expected result. The first three files are being written to the tar but when I close and open up the file again to write to it, the new file is never being written. The code runs fine. I don't know what I'm missing.

The following code gives me a tar file with three files in it: readme.txt, gopher.txt, todo.txt. foo.bar never gets written.

package main

import (
	"archive/tar"
	"log"
	"os"
)

func main() {
	f, err := os.Create("/home/jeff/Desktop/test.tar")
	if err != nil {
		log.Fatalln(err)
	}

	tw := tar.NewWriter(f)

	var files = []struct {
		Name, Body string
	}{
		{"readme.txt", "This archive contains some text files."},
		{"gopher.txt", "Gopher names:\nGeorge\nGeoffrey\nGonzo"},
		{"todo.txt", "Get animal handling licence."},
	}
	for _, file := range files {
		hdr := &tar.Header{
			Name: file.Name,
			Size: int64(len(file.Body)),
		}
		if err := tw.WriteHeader(hdr); err != nil {
			log.Fatalln(err)
		}
		if _, err := tw.Write([]byte(file.Body)); err != nil {
			log.Fatalln(err)
		}
	}
	if err := tw.Close(); err != nil {
		log.Fatalln(err)
	}
	f.Close()

	// Open up the file and append more things to it

	f, err = os.OpenFile("/home/jeff/Desktop/test.tar", os.O_APPEND|os.O_WRONLY, os.ModePerm)
	if err != nil {
		log.Fatalln(err)
	}
	tw = tar.NewWriter(f)

	test := "this is a test"

	hdr := &tar.Header{
		Name: "foo.bar",
		Size: int64(len(test)),
	}

	if err := tw.WriteHeader(hdr); err != nil {
		log.Fatalln(err)
	}

	if _, err := tw.Write([]byte(test)); err != nil {
		log.Fatalln(err)
	}

	if err := tw.Close(); err != nil {
		log.Fatalln(err)
	}
	f.Close()
}

Answer 1

Score: 14

The tar file specification states:

>> A tar archive consists of a series of 512-byte records. Each file system object requires a header record which stores basic metadata (pathname, owner, permissions, etc.) and zero or more records containing any file data. The end of the archive is indicated by two records consisting entirely of zero bytes.

The Go implementation writes these two zero-filled records when the writer is closed, in (*tar.Writer).Close.
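The effect of that trailer on the code in the question can be reproduced in memory. The following is a minimal sketch, not part of the original answer: the helpers `buildArchive` and `listNames` are names made up for this illustration, and concatenating two byte slices stands in for reopening the file with `os.O_APPEND`.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"log"
)

// buildArchive (hypothetical helper) returns a complete tar archive:
// one entry followed by the trailer.
func buildArchive(name, body string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write([]byte(body)); err != nil {
		return nil, err
	}
	// Close pads the last entry and writes the two zero-filled end records.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listNames (hypothetical helper) returns the entry names that a
// tar.Reader finds in data.
func listNames(data []byte) ([]string, error) {
	var names []string
	tr := tar.NewReader(bytes.NewReader(data))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	first, err := buildArchive("readme.txt", "hello")
	if err != nil {
		log.Fatalln(err)
	}
	second, err := buildArchive("foo.bar", "this is a test")
	if err != nil {
		log.Fatalln(err)
	}

	trailer := first[len(first)-1024:]
	fmt.Println("trailer is all zeros:", bytes.Equal(trailer, make([]byte, 1024)))

	// Simulate O_APPEND: the second archive lands after the first trailer.
	combined := append(append([]byte{}, first...), second...)
	names, err := listNames(combined)
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println(names) // foo.bar is present in the bytes but not in this list
}
```

The reader treats the first pair of zero records as the end of the archive, so anything written after them is never listed, which matches the symptom described in the question.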

To get around the tar file format trailer (basically 1024 bytes of nothing) you could replace the lines:

f, err = os.OpenFile("/home/jeff/Desktop/test.tar", os.O_APPEND|os.O_WRONLY, os.ModePerm)
if err != nil {
	log.Fatalln(err)
}
tw = tar.NewWriter(f)

With:

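f, err = os.OpenFile("/home/jeff/Desktop/test.tar", os.O_RDWR, os.ModePerm)
if err != nil {
	log.Fatalln(err)
}
// Step back over the 1024-byte trailer so the new entry overwrites it.
// (os.SEEK_END is deprecated; io.SeekEnd is the current equivalent.)
if _, err = f.Seek(-1024, os.SEEK_END); err != nil {
	log.Fatalln(err)
}
tw = tar.NewWriter(f)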

It opens the file read / write (instead of append / write-only) and then seeks to 1024 bytes before the end of the file and writes from there.

It works. (I originally wrote "but it is a horrible hack" here and have since struck it out.)

EDIT: After understanding the tar file spec a little better, I no longer believe this is such a hack.

Full code: http://play.golang.org/p/0zRScmY4AC
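For readers who cannot reach the playground link, here is a self-contained sketch of the same seek-back technique. It is not the author's linked code: `addFile`, `appendToGoTar`, `entryNames`, and `demo` are hypothetical names, a temporary file replaces the hard-coded desktop path, and `io.SeekEnd` is used in place of the deprecated `os.SEEK_END`. It assumes the archive was written by `archive/tar` and therefore ends with exactly 1024 zero bytes.

```go
package main

import (
	"archive/tar"
	"fmt"
	"io"
	"log"
	"os"
)

// addFile writes one regular-file entry to tw.
func addFile(tw *tar.Writer, name, body string) error {
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err := tw.Write([]byte(body))
	return err
}

// appendToGoTar adds one entry to an archive written by archive/tar.
// Assumption: the file ends with exactly the 1024-byte trailer.
func appendToGoTar(path, name, body string) error {
	f, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return err
	}
	// Position the write cursor on top of the old trailer.
	if _, err := f.Seek(-1024, io.SeekEnd); err != nil {
		f.Close()
		return err
	}
	tw := tar.NewWriter(f)
	if err := addFile(tw, name, body); err != nil {
		f.Close()
		return err
	}
	// Close writes a fresh trailer after the new entry.
	if err := tw.Close(); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// entryNames lists the entries of the archive at path.
func entryNames(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var names []string
	tr := tar.NewReader(f)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// demo builds a one-entry archive in a temp file, appends to it, and lists it.
func demo() ([]string, error) {
	f, err := os.CreateTemp("", "append-*.tar")
	if err != nil {
		return nil, err
	}
	defer os.Remove(f.Name())
	tw := tar.NewWriter(f)
	if err := addFile(tw, "readme.txt", "This archive contains some text files."); err != nil {
		f.Close()
		return nil, err
	}
	if err := tw.Close(); err != nil {
		f.Close()
		return nil, err
	}
	if err := f.Close(); err != nil {
		return nil, err
	}
	if err := appendToGoTar(f.Name(), "foo.bar", "this is a test"); err != nil {
		return nil, err
	}
	return entryNames(f.Name())
}

func main() {
	names, err := demo()
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println(names)
}
```

Because the new writer starts on top of the old trailer and writes its own trailer on Close, the file still ends with a single pair of zero records. Archives from other tools may carry more trailing padding, in which case the fixed offset of 1024 would be wrong.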

Answer 2

Score: 4

It's just a writer interface, so write bytes to it after writing your file's header.

import (
	"archive/tar"
	"os"
)

f, err := os.OpenFile(path, os.O_APPEND|os.O_WRONLY, os.ModePerm)
if err != nil {
	// handle error here
}
hdr := &tar.Header{} // WriteHeader takes a *tar.Header
// populate your header
tw := tar.NewWriter(f)
// append a file
tw.WriteHeader(hdr)
tw.Write(content_of_file_as_bytes)

http://golang.org/pkg/archive/tar/#Writer tells you all you need to know.

EDIT: It turns out that a tar file gets a trailer written to its end when the writer is closed. So even though you are writing new data to the tar archive, a reader won't read past that trailer. Instead, it looks like you'll have to read the tar archive in first and then rewrite the whole archive to disk, which is suboptimal. The package doesn't support what's needed to append to an archive, so that's the best I can recommend right now.
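The read-then-rewrite approach described in the edit could look roughly like the sketch below. It is only an illustration under stated assumptions: `writeEntry`, `copyAndAppend`, and `rewriteDemo` are made-up names, and in-memory buffers stand in for the old and new archive files.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"log"
)

// writeEntry adds one regular file to tw.
func writeEntry(tw *tar.Writer, name string, body []byte) error {
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err := tw.Write(body)
	return err
}

// copyAndAppend streams every entry of src into dst, adds one more file,
// and closes the new archive so it gets a single trailer at the very end.
func copyAndAppend(dst io.Writer, src io.Reader, name string, body []byte) error {
	tr := tar.NewReader(src)
	tw := tar.NewWriter(dst)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		// tr now yields exactly this entry's data.
		if _, err := io.Copy(tw, tr); err != nil {
			return err
		}
	}
	if err := writeEntry(tw, name, body); err != nil {
		return err
	}
	return tw.Close()
}

// rewriteDemo builds a small archive in memory, rewrites it with an extra
// entry, and returns the entry names of the result.
func rewriteDemo() ([]string, error) {
	var old bytes.Buffer
	tw := tar.NewWriter(&old)
	if err := writeEntry(tw, "readme.txt", []byte("hello")); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}

	var updated bytes.Buffer
	if err := copyAndAppend(&updated, &old, "foo.bar", []byte("this is a test")); err != nil {
		return nil, err
	}

	var names []string
	tr := tar.NewReader(&updated)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	names, err := rewriteDemo()
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println(names)
}
```

This costs a full copy of the archive, but it makes no assumption about how much padding the original writer left at the end.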

Answer 3

Score: 0

I found the accepted solution by @Intermernet to be basically correct, except that the padding at the end of an archive is generally arbitrary (unless you control the writer too).

This works for me as far as I can tell so far:

var lastFileSize, lastStreamPos int64
tr := tar.NewReader(output)
for {
	hdr, err := tr.Next()
	if err == io.EOF {
		break
	}
	if err != nil {
		return err
	}
	lastStreamPos, err = output.Seek(0, io.SeekCurrent)
	if err != nil {
		return err
	}
	lastFileSize = hdr.Size
}

const blockSize = 512
newOffset := lastStreamPos + lastFileSize
// shift to next-nearest block boundary (unless we are already on it)
if (newOffset % blockSize) != 0 {
	newOffset += blockSize - (newOffset % blockSize) 
}
_, err := output.Seek(newOffset, io.SeekStart)
if err != nil {
	return err
}

tw := tar.NewWriter(output)
defer tw.Close()

// ... proceed to write to tw as usual ...

I'm not 100% sure why this works. But the basic idea is to scan the archive, find the offset of the start of the last file, then skip over it, then align to the nearest block boundary, then start overwriting from there.
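A plausible reading of why it works: every entry occupies a whole number of 512-byte blocks, so the first block boundary at or after the end of the last entry's data is where the end-of-archive zero records begin, however many of them the original writer emitted. The sketch below packages the scan as a function and tries it on an archive with extra trailing padding. The names `appendOffset`, `archiveOf`, `entryNames`, and `paddedDemo` are hypothetical, and the code relies on the same assumption as the snippet above, namely that `tar.Reader` leaves the stream positioned right after each header when `Next` returns. Sparse entries and other unusual formats are not considered.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"log"
)

const blockSize = 512

// appendOffset scans the archive in rs and returns the offset just past the
// last entry's data, rounded up to a block boundary.
func appendOffset(rs io.ReadSeeker) (int64, error) {
	var end int64
	tr := tar.NewReader(rs)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			return 0, err
		}
		// Assumption: the stream now sits right behind this entry's header.
		pos, err := rs.Seek(0, io.SeekCurrent)
		if err != nil {
			return 0, err
		}
		end = pos + hdr.Size
	}
	if rem := end % blockSize; rem != 0 {
		end += blockSize - rem
	}
	return end, nil
}

// archiveOf returns a complete one-entry archive, trailer included.
func archiveOf(name, body string) ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(body))}
	if err := tw.WriteHeader(hdr); err != nil {
		return nil, err
	}
	if _, err := tw.Write([]byte(body)); err != nil {
		return nil, err
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// entryNames lists the entries found in data.
func entryNames(data []byte) ([]string, error) {
	var names []string
	tr := tar.NewReader(bytes.NewReader(data))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

// paddedDemo appends to an archive whose trailer is longer than 1024 bytes.
func paddedDemo() (int64, []string, error) {
	old, err := archiveOf("readme.txt", "hello")
	if err != nil {
		return 0, nil, err
	}
	// Extra zero blocks stand in for a writer that pads beyond 1024 bytes.
	old = append(old, make([]byte, 4096)...)

	off, err := appendOffset(bytes.NewReader(old))
	if err != nil {
		return 0, nil, err
	}
	extra, err := archiveOf("foo.bar", "this is a test")
	if err != nil {
		return 0, nil, err
	}
	// Keep everything before off and put the new entry plus trailer there.
	n := int(off)
	updated := append(old[:n:n], extra...)
	names, err := entryNames(updated)
	return off, names, err
}

func main() {
	off, names, err := paddedDemo()
	if err != nil {
		log.Fatalln(err)
	}
	fmt.Println(off, names)
}
```

In this demo the single entry takes one header block and one data block, so the computed offset is 1024 even though the padded archive is 6144 bytes long; a fixed seek of 1024 bytes back from the end would have landed in the middle of the padding instead.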

  • Published on August 20, 2013 at 06:15:59