Extracting substrings in Go

Question

I'm trying to read an entire line from the console (including whitespace), then process it. Using bufio.ReadString, the newline character is read together with the input, so I came up with the following code to trim the newline character:

input, _ := src.ReadString('\n')
inputFmt := input[0:len(input)-2] + "" // Need to manually add end of string

Is there a more idiomatic way to do this? That is, is there already a library that takes care of the ending null byte when extracting substrings for you?

(Yes, I know there is already a way to read a line without the newline character in https://stackoverflow.com/questions/6141604/go-readline-string but I'm looking more for elegant string manipulation.)

Answer 1

Score: 230

It looks like you're confused about how slices and strings are stored in Go, which differs from what you have in C.

  • Every slice and string in Go stores its length (in bytes for a string), so you don't have to care about the cost of the len operation: nothing needs to be counted.
  • Go strings aren't null-terminated, so there is no null byte to remove, and you don't need to append an empty string after slicing.

To remove the last character (if it is a single-byte character), simply do:

inputFmt := input[:len(input)-1]
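
The two points above can be checked with a small program. This is only a minimal sketch: the helper name trimLastByte and the sample input are made up for illustration, and the empty-string check is an addition that the one-liner above does not have.

```go
package main

import "fmt"

// trimLastByte (hypothetical helper) returns s without its final byte.
// It assumes the final character is a single byte, such as '\n', and
// returns an empty string unchanged instead of panicking.
func trimLastByte(s string) string {
	if len(s) == 0 {
		return s
	}
	return s[:len(s)-1]
}

func main() {
	input := "hello world\n" // hypothetical line as returned by ReadString('\n')

	fmt.Println(len(input))                 // 12: eleven visible bytes plus '\n', no trailing NUL
	fmt.Printf("%q\n", trimLastByte(input)) // "hello world"
	fmt.Printf("%q\n", input[6:11])         // "world"
}
```

Slicing a string does not copy its bytes; the result is a new string header that refers to the same underlying data, which is why no terminator has to be added.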

Answer 2

Score: 89

WARNING: slicing a string by byte index is only safe for ASCII. With non-ASCII UTF-8 input the byte indices no longer match character positions, so the count will be wrong and the cut can land in the middle of a multibyte character and corrupt it.

Here's a UTF-8-aware version:

// NOTE: this works on single code points, so it is not aware of sequences made
//       of several code points, like specifying skintone or gender of an
//       emoji: https://unicode.org/emoji/charts/full-emoji-modifiers.html
func substr(input string, start int, length int) string {
	asRunes := []rune(input)

	// Negative arguments would make the slice expression below panic.
	if start < 0 || length < 0 || start >= len(asRunes) {
		return ""
	}

	if start+length > len(asRunes) {
		length = len(asRunes) - start
	}

	return string(asRunes[start : start+length])
}
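
To make the warning concrete, here is a small sketch that compares byte counts with code point counts and trims the last code point instead of the last byte. The helper name trimLastRune and the sample word are assumptions for illustration; utf8.DecodeLastRuneInString and utf8.RuneCountInString come from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// trimLastRune (hypothetical helper) removes the final UTF-8 encoded code
// point from s, however many bytes it occupies. For an empty string the
// decoded size is 0, so the string is returned unchanged.
func trimLastRune(s string) string {
	_, size := utf8.DecodeLastRuneInString(s)
	return s[:len(s)-size]
}

func main() {
	// Hypothetical input; the escapes make the two 2-byte characters explicit.
	word := "na\u00efve caf\u00e9"

	fmt.Println(len(word))                    // 12 bytes
	fmt.Println(utf8.RuneCountInString(word)) // 10 code points
	fmt.Printf("%q\n", word[:len(word)-1])    // the final character is cut in half
	fmt.Printf("%q\n", trimLastRune(word))    // the final character is removed whole
}
```

Unlike converting the whole string to []rune, this only decodes the last code point, which may matter for long inputs.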

Answer 3

Score: 31

Go strings are not null terminated, and to remove the last char of a string you can simply do:

s = s[:len(s)-1]

Answer 4

Score: 18

This is a simple way to take a substring in Go:

package main

import "fmt"

func main() {

  value := "address;bar"

  // Take the substring from byte index 2 to the end of the string
  substring := value[2:]
  fmt.Println(substring)

}

Answer 5

Score: 11

To avoid a panic on zero-length input, wrap the truncation in an if:

input, _ := src.ReadString('\n')
var inputFmt string
if len(input) > 0 {
	inputFmt = input[:len(input)-1]
}
// Do something with inputFmt

Answer 6

Score: 3

To get a substring:

  1. Find the position of "sp".

  2. Cut the string at that position using slice syntax.

https://play.golang.org/p/0Redd_qiZM
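
The two steps might look like the sketch below. This is a guess at the idea and not the code behind the playground link: the helper name after, the sample word, and the choice to keep the part following the match are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// after (hypothetical helper) returns the part of s that follows the first
// occurrence of sep, and false if sep does not occur in s.
func after(s, sep string) (string, bool) {
	i := strings.Index(s, sep) // step 1: find the position
	if i < 0 {
		return "", false
	}
	return s[i+len(sep):], true // step 2: slice from the end of the match
}

func main() {
	rest, ok := after("transport", "sp") // hypothetical input
	fmt.Println(rest, ok)                // ort true
}
```

strings.Index returns a byte offset, so slicing at that offset is safe even when the string contains multibyte characters.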

Answer 7

Score: 3

8 years later I stumbled upon this gem, and yet I don't believe OP's original question was really answered:

> so I came up with the following code to trim the newline character

While the bufio.Reader type supports a ReadLine() method that strips both \r\n and \n, it is meant as a low-level function and is awkward to use because repeated checks are necessary.

IMO an idiomatic way to remove whitespace is to use Go's strings package:

input, _ := src.ReadString('\n')

// more specific to the problem of trailing newlines
actual := strings.TrimRight(input, "\r\n")

// or, if you don't mind trimming leading and trailing whitespace
actual = strings.TrimSpace(input)

See this example in action in the Golang playground: https://play.golang.org/p/HrOWH0kl3Ww
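
For comparison, here is a small self-contained sketch of how these functions differ on the same input. The helper name chomp and the sample line are made up for illustration; strings.TrimSuffix is included only as a contrast and is not part of the answer above.

```go
package main

import (
	"fmt"
	"strings"
)

// chomp (hypothetical helper) removes every trailing '\r' and '\n' from s.
// TrimRight treats its second argument as a set of characters, so several
// trailing line endings are all removed.
func chomp(s string) string {
	return strings.TrimRight(s, "\r\n")
}

func main() {
	input := "  two words \r\n" // hypothetical line with a Windows line ending

	fmt.Printf("%q\n", chomp(input))                    // "  two words "
	fmt.Printf("%q\n", strings.TrimSpace(input))        // "two words"
	fmt.Printf("%q\n", strings.TrimSuffix(input, "\n")) // "  two words \r"
}
```

None of these calls panics on an empty string, so no length check is needed.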

Answer 8

Score: 2

Hope this function will be helpful to someone:

str := "Error 1062: Duplicate entry 'user@email.com' for key 'users.email'"
getViolatedValue(str)

It extracts the first substring that is enclosed in single quotes (') in the main string:

func getViolatedValue(msg string) string {
	i := strings.Index(msg, "'")

	if i > -1 {
		part := msg[i+1:]
		j := strings.Index(part, "'")
		if j > -1 {
			return part[:j]
		}
		return ""
	} else {
		return ""
	}
}
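
The same extraction can be sketched more compactly with strings.Cut, which is in the standard library from Go 1.18 onwards. This is an alternative, not the code from this answer, and the function name firstQuoted is made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// firstQuoted (hypothetical helper) returns the text between the first pair
// of single quotes in msg, or "" if there is no complete pair.
func firstQuoted(msg string) string {
	// Drop everything up to and including the opening quote.
	_, rest, found := strings.Cut(msg, "'")
	if !found {
		return ""
	}
	// Keep everything before the closing quote.
	value, _, found := strings.Cut(rest, "'")
	if !found {
		return ""
	}
	return value
}

func main() {
	str := "Error 1062: Duplicate entry 'user@email.com' for key 'users.email'"
	fmt.Println(firstQuoted(str)) // user@email.com
}
```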

huangapple
  • Published on September 7, 2012 at 10:43:59