How to get file length in Go dynamically?

Question

I have the following code snippet:

package main

import (
    "compress/gzip"
    "fmt"
    "os"
)

func main() {
    // Stand-in for the rows returned by the database cursor.
    rows := []string{"bird and frog"}

    // Open a file for writing.
    f, err := os.Create("C:\\programs\\file.gz")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // Create gzip writer.
    w := gzip.NewWriter(f)

    // Write bytes in compressed form to the file.
    // In the real program this loops over the database cursor.
    for _, row := range rows {
        w.Write([]byte(row))
    }

    // Close the gzip writer, which flushes any remaining data.
    w.Close()

    fmt.Println("DONE")
}

I would like to make a small modification: when the file reaches a certain size threshold, I want to close it and open a new file, which should also be compressed.

For example:

Assume the database has 10 rows, each 50 bytes long.

Assume a compression factor of 2, i.e. one 50-byte row compresses to 25 bytes.

Assume the file size limit is 50 bytes.

This means that after every 2 records I should close the file and open a new one.

How can I keep track of the file size while the file is still open and compressed documents are still being written to it?

Answer 1

Score: 2

gzip.NewWriter takes an io.Writer, so it is easy to implement a custom io.Writer that does what you want.

For example (Playground):

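// MultiFileWriter is an io.Writer that switches to a new file once
// the current one would grow beyond maxLimit bytes.
type MultiFileWriter struct {
	maxLimit      int
	currentSize   int
	currentWriter io.Writer
}

func (m *MultiFileWriter) Write(data []byte) (n int, err error) {
	if len(data)+m.currentSize > m.maxLimit {
		// createNextFile (not shown) opens the next output file.
		m.currentWriter = createNextFile()
		m.currentSize = 0 // the new file starts empty
	}
	m.currentSize += len(data)
	return m.currentWriter.Write(data)
}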

Note: You will need to handle a few edge cases, such as what happens if len(data) is greater than maxLimit. You may also not want to split a record across files.

Answer 2

Score: 1

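You can use the os.File.Seek method to get your current position in the file. Since you are only appending to the file, that position is the current file size in bytes. Call w.Flush() first so that any buffered compressed data has actually reached the file.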

For example:

package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

func main() {
	// Some text we want to compress.
	lines := []string{
		"this is a test",
		"the quick brown fox",
		"jumped over the lazy dog",
		"the end",
	}

	// Open a file for writing.
	f, err := os.Create("file.gz")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Create gzip writer.
	w := gzip.NewWriter(f)

	// Write bytes in compressed form to the file.
	for _, line := range lines {
		w.Write([]byte(line))

		// Flush so the compressed bytes reach the file before measuring.
		w.Flush()
		// io.SeekCurrent replaces the deprecated os.SEEK_CUR.
		pos, err := f.Seek(0, io.SeekCurrent)
		if err != nil {
			panic(err)
		}

		fmt.Printf("pos: %d\n", pos)
	}

	// Close the gzip writer.
	w.Close()

	// The call to w.Close() will write out any remaining data
	// and the final checksum.
	pos, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		panic(err)
	}
	fmt.Printf("pos: %d\n", pos)

	fmt.Println("DONE")
}

Which outputs:

pos: 30
pos: 55
pos: 83
pos: 94
pos: 107
DONE

And we can confirm with wc:

$ wc -c file.gz
107 file.gz
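As an alternative to Seek, the size can also be read with os.File.Stat after flushing. The sketch below is only an illustration under that assumption; the helper name sizes and the temporary file pattern are hypothetical. It returns both values so they can be compared.

```go
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

// sizes (hypothetical helper) writes data through a gzip writer into a
// temporary file, flushes, and returns the size reported by Seek and by Stat.
func sizes(data string) (int64, int64, error) {
	f, err := os.CreateTemp("", "size-*.gz")
	if err != nil {
		return 0, 0, err
	}
	// Deferred calls run in reverse order: close the file, then remove it.
	defer os.Remove(f.Name())
	defer f.Close()

	w := gzip.NewWriter(f)
	if _, err := w.Write([]byte(data)); err != nil {
		return 0, 0, err
	}
	// Without Flush, the compressed bytes may still sit in gzip's buffer.
	if err := w.Flush(); err != nil {
		return 0, 0, err
	}

	pos, err := f.Seek(0, io.SeekCurrent)
	if err != nil {
		return 0, 0, err
	}
	info, err := f.Stat()
	if err != nil {
		return 0, 0, err
	}
	// The gzip writer is not closed here: only the flushed bytes are measured.
	return pos, info.Size(), nil
}

func main() {
	pos, size, err := sizes("this is a test")
	if err != nil {
		panic(err)
	}
	fmt.Println("seek and stat agree:", pos == size)
}
```

Seek avoids the extra file metadata lookup, while Stat does not depend on the current file offset, which matters if the program ever seeks elsewhere in the file.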

huangapple
  • Posted on September 14, 2022, 06:42:26
  • When reposting, please keep the link to this article: https://go.coder-hub.com/73709794.html