2015-07-16 14:31:52, go

Parse input from a particular format

Question

Let us say I have the following string: "Algorithms 1" by Robert Sedgewick. This is input from the terminal.

The format of this string will always be:

Starts with a double quote
Followed by characters (may contain space)
Followed by double quote
Followed by space
Followed by the word "by"
Followed by space
Followed by characters (may contain space)

Knowing the above format, how do I read this?

I tried fmt.Scanf(), but it treats each space-separated word as a separate value. I also looked at regular expressions, but I could not tell whether there is a function that returns the matched values rather than just testing for validity.

Answer 1

Score: 5

1) With character search

The input format is simple enough that a character search with strings.IndexRune() does the job:

s := `"Algorithms 1" by Robert Sedgewick`

s = s[1:]                      // Exclude the first double quote
x := strings.IndexRune(s, '"') // Find the 2nd double quote
title := s[:x]                 // Title is between the 2 double quotes
author := s[x+5:]              // Skip the 5 bytes of `" by `; the rest is the author

Printing results with:

fmt.Println("Title:", title)
fmt.Println("Author:", author)

Output:

Title: Algorithms 1
Author: Robert Sedgewick

Try it on the Go Playground.
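The snippet above trusts the input: if the closing quote is missing, the index is -1 and the slicing panics. As a hedged sketch of the same character-search idea with validation added, the hypothetical helper parseByIndex below returns an error instead. The function name and error messages are illustrative assumptions, not part of the original answer.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseByIndex is a hypothetical helper that applies the character-search
// approach and reports malformed input instead of panicking.
func parseByIndex(s string) (title, author string, err error) {
	if !strings.HasPrefix(s, `"`) {
		return "", "", errors.New("input does not start with a double quote")
	}
	rest := s[1:] // drop the opening quote

	x := strings.IndexByte(rest, '"') // position of the closing quote
	if x < 0 {
		return "", "", errors.New("closing double quote not found")
	}

	const sep = " by "
	if !strings.HasPrefix(rest[x+1:], sep) {
		return "", "", errors.New(`closing quote is not followed by " by "`)
	}
	return rest[:x], rest[x+1+len(sep):], nil
}

func main() {
	title, author, err := parseByIndex(`"Algorithms 1" by Robert Sedgewick`)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```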

2) With splitting

Another solution would be to use strings.Split():

s := `"Algorithms 1" by Robert Sedgewick`

parts := strings.Split(s, `"`)
title := parts[1]      // First part is empty, 2nd is title
author := parts[2][4:] // 3rd is author, but cut off " by "

Output is the same. Try it on the Go Playground.

3) With a "tricky" splitting

If we cut off the first double quote, we can split on the separator

`" by `

This yields exactly 2 parts: title and author. Since the first double quote is gone, the separator can only occur at the end of the title (the title cannot contain double quotes as per your rules):

s := `"Algorithms 1" by Robert Sedgewick`

parts := strings.Split(s[1:], `" by `)
title := parts[0]  // First part is exactly the title
author := parts[1] // 2nd part is exactly the author

Try it on the Go Playground.
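On Go 1.18 or later, strings.Cut() expresses the same separator-based split while also reporting whether the separator was found, and it only splits at the first occurrence. The sketch below is one possible variant, not part of the original answer; parseByCut is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// parseByCut is a hypothetical helper: it drops the opening quote and cuts
// the remainder at the first occurrence of `" by `.
func parseByCut(s string) (title, author string, ok bool) {
	if !strings.HasPrefix(s, `"`) {
		return "", "", false
	}
	title, author, ok = strings.Cut(s[1:], `" by `)
	if !ok {
		return "", "", false // separator missing: input is malformed
	}
	return title, author, true
}

func main() {
	title, author, ok := parseByCut(`"Algorithms 1" by Robert Sedgewick`)
	if !ok {
		fmt.Println("input does not match the expected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```

Compared with strings.Split(), this avoids indexing into a slice that may have fewer than two elements when the input is malformed.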

4) With regexp

If after all the above solutions you still want to use regexp, here's how you could do it:

Use parentheses to define the submatches you want to extract. You want 2 parts: the title between the quotes and the author that follows "by". Call FindStringSubmatch() on the compiled regexp to get the matching parts. Note that the first element of the returned slice is the text of the whole match, so the relevant parts are the subsequent elements:

s := `"Algorithms 1" by Robert Sedgewick`

r := regexp.MustCompile(`"([^"]*)" by (.*)`)
parts := r.FindStringSubmatch(s)
title := parts[1]  // parts[0] is always the whole match; parts[1] is the title
author := parts[2] // parts[2] is the author

Try it on the Go Playground.
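FindStringSubmatch() returns nil when the input does not match, so indexing the result directly panics on malformed input. A hedged sketch of the regexp approach with that case handled follows; the anchors ^ and $ are an added assumption that the whole line must match, and parseByRegexp and bookRe are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// bookRe is compiled once at package level. The ^ and $ anchors are an
// assumption that the entire line must have the expected format.
var bookRe = regexp.MustCompile(`^"([^"]*)" by (.*)$`)

// parseByRegexp is a hypothetical helper wrapping FindStringSubmatch
// with a check for the no-match case.
func parseByRegexp(s string) (title, author string, ok bool) {
	m := bookRe.FindStringSubmatch(s)
	if m == nil {
		return "", "", false // no match: m[1] and m[2] would panic
	}
	// m[0] is the whole match, m[1] and m[2] are the two groups.
	return m[1], m[2], true
}

func main() {
	title, author, ok := parseByRegexp(`"Algorithms 1" by Robert Sedgewick`)
	if !ok {
		fmt.Println("input does not match the expected format")
		return
	}
	fmt.Println("Title:", title)
	fmt.Println("Author:", author)
}
```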

Answer 2

Score: 4

You should use groups (parentheses) to get out the information you want:

"([\w\s]*)"\sby\s([\w\s]+)\.

This returns two groups:

[1-13] Algorithms 1
[18-34] Robert Sedgewick

To get the groups out, use a regex method that returns all matches in a text; each match carries its groups.

In Go that method is FindAllStringSubmatch, which returns one slice per match: the whole match followed by each group.
(https://github.com/StefanSchroeder/Golang-Regex-Tutorial/blob/master/01-chapter2.markdown)

Test it out here:
https://regex101.com/r/cT2sC5/1
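As a rough sketch of how this pattern could be used from Go, the example below runs FindAllStringSubmatch over a text containing two entries. The second entry, the book type, and the findBooks helper are illustrative assumptions; note that this pattern requires each author to be followed by a period.

```go
package main

import (
	"fmt"
	"regexp"
)

// bookRe is the pattern from this answer; it expects a trailing period.
var bookRe = regexp.MustCompile(`"([\w\s]*)"\sby\s([\w\s]+)\.`)

// book is a hypothetical type holding the two captured groups.
type book struct {
	Title, Author string
}

// findBooks is a hypothetical helper returning every match in text.
func findBooks(text string) []book {
	var books []book
	// -1 asks for all matches; each m is [whole match, group 1, group 2].
	for _, m := range bookRe.FindAllStringSubmatch(text, -1) {
		books = append(books, book{Title: m[1], Author: m[2]})
	}
	return books
}

func main() {
	text := `"Algorithms 1" by Robert Sedgewick. "The Go Programming Language" by Alan Donovan.`
	for _, b := range findBooks(text) {
		fmt.Printf("Title: %s, Author: %s\n", b.Title, b.Author)
	}
}
```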
