How to coerce _string-type_ to _fixed-length string-type_ in Go?

Question

I have a simple program that opens a text file with two space-separated elements per line (first name and last name). I then put the first name and last name into a struct (T), which goes into a slice of such structs ([]T).

There is a catch: the first name and last name must each be limited to 20 characters. When reading the text file, I'm unable to coerce the text to that fixed length. How can I coerce a general string into a "maximum-20-character string"?

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

type T struct {
	fname [20]string
	lname [20]string
}

func main() {
	// Ask for file name (fn)
	fmt.Println("Enter file name, e.g.: 'names.txt'")
	var fn string
	_, _ = fmt.Scan(&fn)

	// Open file
	file, err := os.Open(fn)
	if err != nil {
		fmt.Println(err)
	}
	defer file.Close()

	// Create a slice, in which the T constructs will be populated.
	var slice []T

	// Counter for each line
	i := 0

	// Read each line and iterate on it
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
		strline := scanner.Text()
		// words := strings.Split(strline, " ")
		// var words [20]string
		words := strings.Fields(strline)
		// fmt.Println(words, len(words))
		slice[i] = T{fname: words[0], lname: words[1]}
		i = i + 1
	}

	if err := scanner.Err(); err != nil {
		fmt.Println(err)
	}

	// return slice with T-type elements
	fmt.Println(slice)
}

In the terminal,

go build read.go
# command-line-arguments
./read.go:43:23: cannot use words[0] (variable of type string) as type [20]string in struct literal
./read.go:43:40: cannot use words[1] (variable of type string) as type [20]string in struct literal

Answer 1

Score: 0

You are going to need to make various decisions:

  • Do you want 20 bytes, or 20 runes?
  • What do you want to do with shorter strings?
  • What do you want to do with longer strings?
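The first question matters because a Go string is a sequence of bytes, and a non-ASCII character occupies more than one byte in UTF-8. As a small illustration (the helper name `lengths` is made up for this sketch), the built-in `len` counts bytes while `utf8.RuneCountInString` counts runes:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths is a hypothetical helper: it reports the size of s
// in bytes and in runes (Unicode code points).
func lengths(s string) (byteLen, runeLen int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// "Zo\u00eb" is "Zoë" written with an escape for the non-ASCII letter.
	for _, name := range []string{"Zoe", "Zo\u00eb", "世界"} {
		b, r := lengths(name)
		fmt.Printf("%q: %d bytes, %d runes\n", name, b, r)
	}
}
```

Here "Zoe" is 3 bytes and 3 runes, "Zoë" is 4 bytes and 3 runes, and "世界" is 6 bytes and 2 runes, so a 20-byte limit and a 20-rune limit can accept quite different names.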

Once you've made some of these decisions, it is time to write some Go code:

func to20bytes(s string) [20]byte {
	... some code ...
}

or:

func to20runes(s string) ([20]rune, error) {
	... some code ...
}

for instance.

Since an array of 20 bytes or runes always contains exactly 20 bytes or runes, these arrays cannot contain shorter "strings" (I put the word "strings" in quotes here as an array of bytes or runes is not a Go string). In ancient times, people often space-padded arrays of bytes, so that the first function might read:

func to20bytes(s string) (ret [20]byte) {
	for i, c := range s {
		if c > 127 {
			panic("non-ASCII-character")
		}
		ret[i] = byte(c) // panic if i >= 20
	}
	for i := len(s); i < 20; i++ {
		ret[i] = ' '
	}
	return
}

This function is terrible: it panics on any non-ASCII character and on any input longer than 20 bytes; see the playground example for details. Returning an error, or truncating, is probably better, but only you can say; using runes is probably better, but only you can say.
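As one possible way to act on those suggestions, here is a minimal sketch, not a definitive implementation. It assumes the decisions went as follows: the limit is 20 runes, short input is space-padded, and long input is either rejected with an error or truncated. The names `errTooLong` and `truncateRunes` are invented for this example; `to20runes` reuses the signature suggested earlier.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooLong is a hypothetical sentinel error for input over 20 runes.
var errTooLong = errors.New("name longer than 20 runes")

// to20runes copies s into a fixed array of 20 runes. Short input is
// padded with spaces; input longer than 20 runes returns errTooLong.
func to20runes(s string) ([20]rune, error) {
	var ret [20]rune
	for i := range ret {
		ret[i] = ' '
	}
	n := 0
	for _, r := range s {
		if n == len(ret) {
			return [20]rune{}, errTooLong
		}
		ret[n] = r
		n++
	}
	return ret, nil
}

// truncateRunes returns at most n runes of s (n assumed non-negative).
// It cuts only at rune boundaries, so no character is split.
func truncateRunes(s string, n int) string {
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s
}

func main() {
	arr, err := to20runes("Ada")
	fmt.Printf("%q %v\n", string(arr[:]), err)

	_, err = to20runes("a name that is far too long")
	fmt.Println(err)

	fmt.Println(truncateRunes("a name that is far too long", 20))
}
```

If the struct fields stay ordinary strings rather than arrays, `truncateRunes` alone may be enough: a field such as `fname string` can then hold `truncateRunes(words[0], 20)`, which gives "at most 20 characters" without any padding.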

huangapple
  • Posted on 2022-04-24 01:07:29
  • When reposting, please keep the link to this article: https://go.coder-hub.com/71981889.html