Reading data from a text file with Go

Question

I have a text file containing some text data that I would like to read in.
Unfortunately, I can't find a way to do it.

Here is an example:

5 4
1 2 - Yogurt
2 0 X Chicken soup
3 1 X Cheese 
4 3 X Ham 
2
3
4
0

The file is made of three parts. The first part is the header (the first line), the second part is a list of records, and the last part is a list of uint64 values.

The header contains only two values, a uint64 followed by a uint16. The second value is the number of records and also the number of values in the third part, since these two counts are the same.

A record is a uint64 value, followed by a uint16 value, followed by a single char that can only be X or -, followed by a UTF-8 encoded string up to the end of the line. The data has been written into the file using fmt.Fprintf().
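To make the record layout concrete, here is a minimal sketch of parsing one record line with strings.SplitN and strconv.ParseUint. The Record type, its field names, and the parseRecord helper are hypothetical names chosen for illustration, and the sketch assumes the fields are separated by exactly one space.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Record mirrors one line of the second part of the file.
// The type and field names are hypothetical; the question does not name them.
type Record struct {
	First  uint64
	Second uint16
	Flag   byte   // 'X' or '-'
	Text   string // rest of the line, UTF-8
}

// parseRecord splits a line such as "2 0 X Chicken soup" into its four fields.
// It assumes a single space between fields.
func parseRecord(line string) (Record, error) {
	// SplitN with n=4 keeps any spaces inside the trailing string intact.
	parts := strings.SplitN(line, " ", 4)
	if len(parts) != 4 {
		return Record{}, fmt.Errorf("expected 4 fields, got %d", len(parts))
	}
	first, err := strconv.ParseUint(parts[0], 10, 64)
	if err != nil {
		return Record{}, err
	}
	// bitSize 16 makes ParseUint reject values that do not fit in a uint16.
	second, err := strconv.ParseUint(parts[1], 10, 16)
	if err != nil {
		return Record{}, err
	}
	if parts[2] != "X" && parts[2] != "-" {
		return Record{}, fmt.Errorf("invalid flag %q", parts[2])
	}
	return Record{First: first, Second: uint16(second), Flag: parts[2][0], Text: parts[3]}, nil
}

func main() {
	rec, err := parseRecord("2 0 X Chicken soup")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d %d %c %q\n", rec.First, rec.Second, rec.Flag, rec.Text)
}
```

Splitting into at most four parts is what lets the trailing string contain spaces of its own, as in "Chicken soup".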

The third part contains uint64 values.

I've now spent some hours trying to find out how to read this data out of the text file, and I can't find a way.

I can only use strconv.ParseUint(str, 0, 64) or strconv.ParseUint(str, 0, 16) (converting the result to uint16) if str contains only the digits belonging to the number.
I looked into bufio to use the Reader, but the most I can get from it is whole lines.
I should probably use bufio.Scanner, but I can't determine how to use it from the documentation.

Answer 1

Score: 4

The rubber duck debugging effect worked again. After asking the question, I found the answer myself. Here it is in case other people run into the same problem.

In the following example I simply print out the data that was read.

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strings"
)

func loadFile(fileName string) {
    // Open the file and wrap it in a buffered reader
    file, err := os.Open(fileName)
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()
    reader := bufio.NewReader(file)

    var (
        value0  uint64
        nbrRows uint16
    )

    // Read the header values
    if _, err := fmt.Fscanf(reader, "%d %d\n", &value0, &nbrRows); err != nil {
        log.Fatal(err)
    }

    // Iterate over the records
    for i := uint16(0); i < nbrRows; i++ {
        var (
            value1 uint64
            value2 uint16
            value3 string
        )

        // Read the first three values of the record. The format has no
        // trailing "\n" because the rest of the line is still unread, and
        // current Go versions require newlines in the format to match the input.
        if _, err := fmt.Fscanf(reader, "%d %d %s", &value1, &value2, &value3); err != nil {
            log.Fatal(err)
        }
        // Read the remainder of the line. value4 is declared outside the
        // if statement so that it stays in scope afterwards.
        value4, err := reader.ReadString('\n')
        if err != nil {
            log.Fatal(err)
        }
        value4 = strings.Trim(value4, " \n")

        // Display the parsed data
        fmt.Printf("%d %d %s '%s'\n", value1, value2, value3, value4)
    }

    // Iterate over the lines containing a single integer value
    for i := uint16(0); i < nbrRows; i++ {
        var value5 uint64

        // Read the value
        if _, err := fmt.Fscanf(reader, "%d\n", &value5); err != nil {
            log.Fatal(err)
        }

        // Display the parsed data
        fmt.Printf("%d\n", value5)
    }
}

This code assumes that value4 doesn't start or end with spaces or newlines, which is true in my case, because the Trim() call removes them.
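If leading or trailing spaces in the string did matter, one possible variation would be to strip only the single separator space and the line ending instead of calling Trim(). This is a sketch under the assumption that exactly one space separates the third field from the string; restOfLine is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// restOfLine takes what ReadString('\n') returned after the first three
// fields were scanned, e.g. " Chicken soup\n", and removes only the single
// separator space and the line ending. The name is hypothetical.
func restOfLine(raw string) string {
	raw = strings.TrimSuffix(raw, "\n")
	raw = strings.TrimSuffix(raw, "\r") // tolerate Windows line endings
	return strings.TrimPrefix(raw, " ")
}

func main() {
	// The trailing space of "Cheese " survives here...
	fmt.Printf("%q\n", restOfLine(" Cheese \n"))
	// ...while Trim with the cutset " \n" removes it.
	fmt.Printf("%q\n", strings.Trim(" Cheese \n", " \n"))
}
```

TrimPrefix and TrimSuffix remove at most one occurrence of the given string, whereas Trim removes every leading and trailing character found in its cutset.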

huangapple
  • Posted on June 16, 2013 at 00:50:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/17125857.html