Correct way to import numeric csv data in go







I want to read a file in CSV format containing only numeric values (with decimals) and store it in a matrix so I can perform operations on it. The file looks like this:

1.5, 2.3, 4.4
1.1, 5.3, 2.4

It may have thousands of lines and more than 3 columns.

I solved this using the Go csv library. This creates a [][]string, and then I use a for loop to parse the matrix into [][]float64.

import (
    "encoding/csv"
    "os"
    "strconv"
)

func readCSV(filepath string) [][]float64 {
    csvfile, err := os.Open(filepath)
    if err != nil {
        return nil
    }
    defer csvfile.Close()

    reader := csv.NewReader(csvfile)
    reader.TrimLeadingSpace = true // the sample file has a space after each comma
    stringMatrix, err := reader.ReadAll()
    if err != nil {
        return nil
    }

    matrix := make([][]float64, len(stringMatrix))

    // Parse string matrix into float64
    for i := range stringMatrix {
        matrix[i] = make([]float64, len(stringMatrix[i]))
        for y := range stringMatrix[i] {
            matrix[i][y], err = strconv.ParseFloat(stringMatrix[i][y], 64)
            if err != nil {
                return nil
            }
        }
    }
    return matrix
}

I was wondering if this is a correct and efficient way of doing it, or if there is a better way.

For example, using reader.Read() instead and parsing each line as it is read. I'm not sure, but it feels like I'm doing a lot of duplicate work.


Score: 4






It all depends on how you want to use the data. Your code isn't memory-efficient because it reads the entire CSV content into memory (stringMatrix) and then creates a second variable to hold the data converted to float64 (matrix). So if your CSV file is 1 GB in size, your program would use roughly 1 GB of RAM or more for stringMatrix, plus 8 bytes per value for matrix.

You can optimize the code by either:

  • Reading the reader line by line and appending the data to matrix; you don't need to have the entire stringMatrix in memory at once;
  • Reading the reader line by line and processing the data line by line. You may not need matrix in memory at all: you can process the data as you read it and never hold everything in memory at once. This depends on how the rest of your program needs to use the CSV data.

If you use the second method above and don't need to return the entire CSV data from that function, your program's memory use can stay small and roughly constant instead of growing to gigabytes.

  • Posted on September 15, 2017, 01:48:34



