How do I read a gzipped CSV file?

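Question

I'm having trouble reading gzip-compressed CSV files.

I want to use the csv and gzip packages for this, but I don't know how to combine them.

gzip.Reader(p []bytes) and csv.Reader() have different signatures.

This is my reader function: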

func reader(filename string, c chan string) {
    fi, err := os.Open(filename)
    var r *bufio.Reader
    if err != nil {
        fmt.Printf("%q\n", err) // Printf, since Println does not interpret format verbs
        os.Exit(1)
    }

    fz, err := g.NewReader(fi) // g is presumably an import alias for compress/gzip

    if err != nil {
        r = bufio.NewReader(fi)
    } else {
        r = bufio.NewReader(fz)
    }

    for {
        line, err := r.ReadString('\n')
        if err != nil {
            fmt.Println("done reading file")
            c <- "done"
            break
        } else {
            c <- fmt.Sprintf("%q",strings.Fields(line))
        }
    }
}

Do you have any suggestions?

Answer 1

Score: 31

Just open the file for reading, pass that file handle to gzip.NewReader, and then pass the resulting gzip reader to csv.NewReader:

package main

import (
	"compress/gzip"
	"encoding/csv"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := os.Open("data.csv.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	gr, err := gzip.NewReader(f)
	if err != nil {
		log.Fatal(err)
	}
	defer gr.Close()

	cr := csv.NewReader(gr)
	rec, err := cr.Read()
	if err != nil {
		log.Fatal(err)
	}
	for _, v := range rec {
		fmt.Println(v)
	}
}

And this is my (uncompressed) data.csv:

"foo","bar","baz"

The output of my program is:

foo
bar
baz

as expected.

huangapple
  • Posted on July 10, 2014, 17:46:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/24673335.html