Golang Regex: FindAllStringSubmatch to []string

Question

I download a multiline file from Amazon S3 in a format like this:

ColumnAv1 ColumnBv1 ColumnCv1 ...
ColumnAv2 ColumnBv2 ColumnCv2 ...

The file is held in memory as a []byte. I then want to parse it with a regular expression:

matches := re.FindAllSubmatch(file,-1)

Then I want to feed the result, row by row, to a function that takes a []string as input (string[0] is ColumnAv1, string[1] is ColumnBv1, ...).

How should I convert the [][][]byte result into a []string for the first row, the second row, and so on? I suppose I should do it in a loop, but I cannot get this working:

for i := 0; i < len(matches); i++ {
    tmp := myfunction(???)
}

BTW, why does FindAllSubmatch return [][][]byte whereas FindAllStringSubmatch returns [][]string?

(Sorry, I don't have access to my real example right now, so the syntax may not be correct.)

Answer 1

Score: 3

It's all explained extensively in the package's documentation.
Read the paragraph that explains:
> There are 16 methods of Regexp that match a regular expression and identify the matched text. Their names are matched by this regular expression:
>
> Find(All)?(String)?(Submatch)?(Index)?
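
The naming scheme also accounts for the different return types asked about in the question. A minimal sketch, assuming a made-up two-column pattern and input (neither comes from the original post): both calls return one entry per match and one entry per submatch, and only the innermost element differs, []byte in one case and string in the other.

```go
package main

import (
	"fmt"
	"regexp"
)

// shapes runs the byte-based and string-based variants on the same input.
// The pattern and the input are hypothetical, chosen only for illustration.
func shapes() ([][][]byte, [][]string) {
	re := regexp.MustCompile(`(\w+) (\w+)`)
	input := "a1 b1\na2 b2"
	byteMatches := re.FindAllSubmatch([]byte(input), -1)
	strMatches := re.FindAllStringSubmatch(input, -1)
	return byteMatches, strMatches
}

func main() {
	b, s := shapes()
	// Outer index: which match. Inner index: 0 is the whole match,
	// 1..n are the capture groups.
	fmt.Println(len(b), len(b[0]), string(b[1][2]))
	fmt.Println(len(s), len(s[0]), s[1][2])
}
```

So [][][]byte is "matches x submatches x []byte", while [][]string is "matches x submatches x string"; the two have the same logical shape.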

In your case, you probably want to use FindAllStringSubmatch.
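
A rough sketch of that approach, under some assumptions: the pattern (three whitespace-separated columns per line) and the body of myfunction are invented here, since the question shows neither. The downloaded []byte is converted to a string once, and each match already arrives as a []string.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// myfunction is a stand-in for the asker's function; its real body is unknown.
func myfunction(cols []string) string {
	return strings.Join(cols, "|")
}

// processRows converts the file to a string once and hands each row's
// capture groups to myfunction.
func processRows(file []byte) []string {
	// Hypothetical pattern: three whitespace-separated columns per line.
	re := regexp.MustCompile(`(?m)^(\S+)[ \t]+(\S+)[ \t]+(\S+)`)
	var out []string
	for _, m := range re.FindAllStringSubmatch(string(file), -1) {
		// m[0] is the whole match; m[1:] holds the columns.
		out = append(out, myfunction(m[1:]))
	}
	return out
}

func main() {
	file := []byte("ColumnAv1 ColumnBv1 ColumnCv1\nColumnAv2 ColumnBv2 ColumnCv2\n")
	for _, r := range processRows(file) {
		fmt.Println(r)
	}
}
```

Note that index 0 of each match is the full matched text, so the columns start at index 1.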


In Go, a string is effectively a read-only slice of bytes.
You can either keep passing []byte values around
or convert the []byte values to string:

var byteSlice = []byte{'F','o','o'}
var str string

str = string(byteSlice)
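
One detail worth knowing: the conversion copies the bytes, so the resulting string does not change if the slice is modified afterwards. A small sketch to illustrate this (the helper name convertThenMutate is made up for the example):

```go
package main

import "fmt"

// convertThenMutate converts b to a string, then overwrites the first byte
// of b, and returns the string so the caller can compare the two.
func convertThenMutate(b []byte) string {
	s := string(b) // the conversion copies the bytes into a new string
	if len(b) > 0 {
		b[0] = 'X' // changes the slice only, not s
	}
	return s
}

func main() {
	byteSlice := []byte{'F', 'o', 'o'}
	str := convertThenMutate(byteSlice)
	fmt.Println(str, string(byteSlice)) // prints "Foo Xoo"
}
```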

Answer 2

Score: 1

You can iterate over the []byte result with two nested loops, just as you would over the string result, and convert each byte slice to a string in the inner loop:

package main

import "fmt"

func main() {
    f := [][][]byte{{{'a', 'b', 'c'}}}
    for _, line := range f {
        for _, match := range line { // match is a []byte
            fmt.Println(string(match))
        }
    }
}
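
To hand each row to a function that expects a []string, as the question asks, the inner loop can collect the converted values instead of printing them. A minimal sketch, assuming a hypothetical two-column pattern; toStrings is a helper name invented for this example, not part of the regexp package:

```go
package main

import (
	"fmt"
	"regexp"
)

// toStrings converts one match, as returned inside FindAllSubmatch's result,
// from [][]byte to []string.
func toStrings(match [][]byte) []string {
	out := make([]string, len(match))
	for i, b := range match {
		out[i] = string(b)
	}
	return out
}

func main() {
	re := regexp.MustCompile(`(\S+) (\S+)`) // hypothetical pattern
	file := []byte("ColumnAv1 ColumnBv1\nColumnAv2 ColumnBv2")
	for _, match := range re.FindAllSubmatch(file, -1) {
		row := toStrings(match[1:]) // skip match[0], the whole match
		fmt.Println(len(row), row)  // row could be passed to myfunction here
	}
}
```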


huangapple
  • Posted on April 29, 2015 at 16:05:37
  • Please keep this link when reposting: https://go.coder-hub.com/29937787.html