2022-10-06 13:08:26 · go

How to split string in Go based on certain prefix and suffix?

Question

Let's say I have this big string:

> 13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a

I want to split it into an array, using 1324 as the prefix and 0d0a as the suffix. The result should be an array of 3 elements:

> arr[0] = 13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a

> arr[1] = 13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a

> arr[2] = 1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a

Here's my code:

package main

import (
	"fmt"
	"regexp"
)

func main() {

	var testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"

	re := regexp.MustCompile("^1324[0-9a-zA-Z]*0d0a")

	matches := re.FindAllString(testData, -1)

	for _, m := range matches {
		fmt.Printf("%s\n", m)
	}
}

It simply prints the same entire string, which very likely means my regex is wrong. What's the proper form?

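Answer 1

Score: 3

Your regex has two issues. The caret (^) anchors the match to the beginning of the string, so by definition you will get at most one result. The other issue is that * is a greedy quantifier: it matches as many characters from the preceding character class as it can, so the match runs all the way to the last 0d0a in the input instead of stopping at the first one. What you want is a reluctant (non-greedy) quantifier, *?, which matches the minimum number of characters needed to satisfy the regex.

Putting it together, your regex string should be "1324[0-9a-zA-Z]*?0d0a". I tested it in the Go Playground and it seems to give the results you want: https://go.dev/play/p/qolk3vHNxKT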

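Answer 2

Score: 1

It is much simpler to call strings.Split on the keyword 1324 and then prefix it back onto each entry.

The result is a slice of strings, split at the delimiter provided. Iterate over it once and prepend the delimiter to each non-empty entry to get the desired result.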

package main

import (
	"fmt"
	"strings"
)

func main() {
	var output []string
	var testData = "13242222160a06032c06cf00ca5c160bdc70102dfe0a12bc00a3b101000000cd01d60d0a13242222160a06032c0ccf00ca5bf10bdc74d029d05401fe0a12bc00a3b101000000d1e4270d0a1324222160a06032c1e0a12bc00a3b101000000d233ed0d0a"
	// Split removes the delimiter, so every piece has lost its leading "1324".
	results := strings.Split(testData, "1324")
	for _, r := range results {
		// The input starts with the delimiter, so the first piece is empty; skip it.
		if len(r) > 0 {
			output = append(output, "1324"+r)
		}
	}
	// Print each reconstructed entry on its own line.
	for _, o := range output {
		fmt.Println(o)
	}
}

Note that on my M1 MacBook Pro, the Split() example performed far better than the regex option when run with Go's benchmarks.
