How can I get xlsx cell data using Go regular expressions?

Question

I am using regular expressions to extract data from an .xlsx file, but I am new to regexp and not very good with it. Could anyone help me?

package main

import (
        "fmt"
        "regexp"
)

func main() {
        input := `
    	<sheetData>
		<row r="2" spans="1:15">
		<c r="A2" s="5" ><v>{{range .txt}}</v></c>
		<c r="B2" s="5" t="s"><v>1</v></c>
		<c r="C2" s="5" t="s"><v>2</v></c>
		<c r="D2" s="5" t="s"><v>3</v></c>
		<c r="E2" s="5" />
		<c r="K2" s="6" t="s"><v>21</v></c>
    </row> 
	<row r="3" spans="1:15">
		<c r="A3" s="5" t="s"><v>0</v></c>
		<c r="B3" s="5" t="s"><v>1</v></c>
		<c r="C3" s="5" t="s"><v>2</v></c>
		<c r="D3" s="5" t="s"><v>3</v></c>
		<c r="E3" s="5" />
		<c r="K3" s="6" t="s"><v>21</v></c>
    </row> 
	</sheetData>`
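	// Attempted pattern for a whole <row> element (this is the part that does not work).
	r := regexp.MustCompile(`<row[^>]*?r="(\d+)"[^>].*?>.*?[(<v>(.*?)<\/v>.*?)]<\/row>`)
	// Pattern for a single <v>...</v> value.
	r2 := regexp.MustCompile(`<v>(.*?)</v>`)
	row := r.FindAllString(input, -1)
	for _, v := range row {
		fmt.Println(r.ReplaceAllStringFunc(v, func(m string) string {
			match := r2.FindAllString(v, -1)
			for kk, vv := range match {
				fmt.Println(kk, vv)
				fmt.Println(r2.ReplaceAllString(v, ""))
			}
			return m // the callback must return a string for the code to compile
		}))
	}
}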

Questions:

  1. How do I get the string {{range .txt}} and strip the surrounding tags (<row><c>...)?

  2. How do I get "3" from r="3", and get "A3,B3,C3..." from <c r="A3">, <c r="B3">, <c r="C3">...?

Thanks in advance!

Answer 1

Score: 3


I think regexp is the wrong tool for this job. Try encoding/xml instead:

import (
	"encoding/xml"
	"fmt"
)

// Could probably pick better names for these.

// C maps a <c> element: its <v> child and its r attribute.
type C struct {
	XMLName xml.Name `xml:"c"`
	V       string   `xml:"v"`
	R       string   `xml:"r,attr"`
}

// Row maps a <row> element and its cells.
type Row struct {
	XMLName xml.Name `xml:"row"`
	C       []C      `xml:"c"`
}

// Result maps the <sheetData> root element.
type Result struct {
	XMLName xml.Name `xml:"sheetData"`
	Row     []Row    `xml:"row"`
}

// Inside main, with input defined as in the question:
v := Result{}

err := xml.Unmarshal([]byte(input), &v)
if err != nil {
	fmt.Printf("error: %v", err)
	return
}
for _, r := range v.Row {
	for _, c := range r.C {
		fmt.Printf("%v %v\n", c.V, c.R)
	}
}

This will print:

{{range .txt}} A2
1 B2
2 C2
3 D2
...
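
The structs above do not capture the row number asked about in the second question. As a rough sketch (not part of the original answer), the same encoding/xml approach can be extended by mapping the r attribute of each <row> as well. The names Cell, SheetData, parseSheet and cellRefs are made up for this illustration, and the input is a shortened copy of the XML from the question:

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Cell mirrors a <c> element: its reference (r attribute) and its <v> value.
type Cell struct {
	Ref   string `xml:"r,attr"`
	Value string `xml:"v"`
}

// Row mirrors a <row> element: its number (r attribute) and its cells.
type Row struct {
	Num   string `xml:"r,attr"`
	Cells []Cell `xml:"c"`
}

// SheetData mirrors the <sheetData> root element.
type SheetData struct {
	XMLName xml.Name `xml:"sheetData"`
	Rows    []Row    `xml:"row"`
}

// input is a shortened copy of the XML from the question.
const input = `<sheetData>
<row r="2" spans="1:15">
<c r="A2" s="5"><v>{{range .txt}}</v></c>
<c r="B2" s="5" t="s"><v>1</v></c>
<c r="E2" s="5" />
</row>
<row r="3" spans="1:15">
<c r="A3" s="5" t="s"><v>0</v></c>
<c r="B3" s="5" t="s"><v>1</v></c>
<c r="C3" s="5" t="s"><v>2</v></c>
</row>
</sheetData>`

// parseSheet (hypothetical helper) decodes the sheet XML into SheetData.
func parseSheet(data string) (SheetData, error) {
	var sheet SheetData
	err := xml.Unmarshal([]byte(data), &sheet)
	return sheet, err
}

// cellRefs (hypothetical helper) joins a row's cell references with commas.
func cellRefs(row Row) string {
	refs := make([]string, 0, len(row.Cells))
	for _, c := range row.Cells {
		refs = append(refs, c.Ref)
	}
	return strings.Join(refs, ",")
}

func main() {
	sheet, err := parseSheet(input)
	if err != nil {
		fmt.Printf("error: %v\n", err)
		return
	}
	for _, row := range sheet.Rows {
		// Row number first, then the comma-separated cell references.
		fmt.Printf("row %s: %s\n", row.Num, cellRefs(row))
		for _, c := range row.Cells {
			// Value is the text inside <v>, with the tags already stripped.
			fmt.Printf("  %s = %q\n", c.Ref, c.Value)
		}
	}
}
```

With this layout, sheet.Rows[0].Cells[0].Value holds the bare string {{range .txt}} (first question), while Row.Num and cellRefs give the "3" and "A3,B3,C3" style values (second question). Cells without a <v> child, such as E2, simply come back with an empty Value.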

huangapple
  • Published on January 19, 2015 at 17:01:11
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28020979.html