Unmarshaling XML in Go

Question

I have the following XML "file":

<!-- begin snippet: js hide: false -->

<!-- language: lang-html -->
<pinnacle_line_feed>
    <PinnacleFeedTime>1439954818555</PinnacleFeedTime>
    <lastContest>34317132</lastContest>
    <lastGame>218491030</lastGame>
    <events>
        <event>
            <event_datetimeGMT>2015-08-21 09:50</event_datetimeGMT>
            <gamenumber>483406220</gamenumber>
            <sporttype>Aussie Rules</sporttype>
            <league>AFL</league>
            <IsLive>No</IsLive>
            <participants>
                <participant>
                    <participant_name>Hawthorn Hawks</participant_name>
                    <contestantnum>1251</contestantnum>
                    <rotnum>1251</rotnum>
                    <visiting_home_draw>Visiting</visiting_home_draw>
                </participant>
                <participant>
                    <participant_name>Port Adelaide Power</participant_name>
                    <contestantnum>1252</contestantnum>
                    <rotnum>1252</rotnum>
                    <visiting_home_draw>Home</visiting_home_draw>
                </participant>
            </participants>
            <periods></periods>
        </event>
    </events>
</pinnacle_line_feed>

<!-- end snippet -->

I am attempting to parse this with Go, and have written the following so far:

<!-- begin snippet: js hide: false -->

<!-- language: lang-html -->
package main
import (
    "fmt"
    "encoding/xml"
)

type Participant struct {
    XMLName            xml.Name `xml:"participant"`
    participant_name   string `xml:"participant_name"`
    contestantnum      int `xml:"contestantnum"`
    rotnum             int `xml:"rotnum"`
    visiting_home_draw string `xml:"visiting_home_draw"`
}

type Event struct {
    XMLName           xml.Name `xml:"event"`
    event_datetimeGMT string `xml:"event_datetimeGMT"`
    gamenumber        string `xml:"gamenumber"`
    sporttype         string `xml:"sporttype"`
    league            string `xml:"league"`
    IsLive            string `xml:"IsLive"`
    Participant       []Participant `xml:"participant"`
}

type Events struct {
    XMLName xml.Name `xml:"events"`
    Event   []Event `xml:"event"`
}

type Pinnacle_Line_Feed struct {
    XMLName          xml.Name `xml:"pinnacle_line_feed"`
    PinnacleFeedTime string `xml:"PinnacleFeedTime"`
    lastContest      string `xml:"lastContest"`
    lastGame         string `xml:"lastGame"`
    Events           []Events `xml:"events"`
}

<!-- end snippet -->

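Frankly, I have researched deeply nested XML parsing in Go quite a bit, and I haven't found much to help with the Unmarshal portion of this code. Ideally, the code would return something similar to a Python dictionary, such as: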

{event_datetimeGMT: 2015-08-21, gamenumber: 483406220, ..., visiting_home_draw: "Home"}

EDIT:

Based on Nicolas's comment below, I added the following. This runs without error, but produces an empty result.

<!-- begin snippet: js hide: false -->

<!-- language: lang-html -->
func main() {
    pinny_xml := `
    <pinnacle_line_feed>
        <PinnacleFeedTime>1439954818555</PinnacleFeedTime>
        <lastContest>34317132</lastContest>
        <lastGame>218491030</lastGame>
        <events>
            <event>
                <event_datetimeGMT>2015-08-21 09:50</event_datetimeGMT>
                <gamenumber>483406220</gamenumber>
                <sporttype>Aussie Rules</sporttype>
                <league>AFL</league>
                <IsLive>No</IsLive>
                <participants>
                    <participant>
                        <participant_name>Hawthorn Hawks</participant_name>
                        <contestantnum>1251</contestantnum>
                        <rotnum>1251</rotnum>
                        <visiting_home_draw>Visiting</visiting_home_draw>
                    </participant>
                    <participant>
                        <participant_name>Port Adelaide Power</participant_name>
                        <contestantnum>1252</contestantnum>
                        <rotnum>1252</rotnum>
                        <visiting_home_draw>Home</visiting_home_draw>
                    </participant>
                </participants>
                <periods></periods>
            </event>
        </events>
    </pinnacle_line_feed>
    `

    xmlReader := bytes.NewReader([]byte(pinny_xml))
    yourPinnacleLineFeed := new(Pinnacle_Line_Feed)
    if err := xml.NewDecoder(xmlReader).Decode(yourPinnacleLineFeed); err != nil {
        return // or log.Panic(err.Error()) if in main
    }
}

<!-- end snippet -->

Answer 1

Score: 2

You can do it like this:

xmlReader := bytes.NewReader([]byte(your_xml_as_a_string_here))
yourPinnacleLineFeed := new(Pinnacle_Line_Feed)
if err := xml.NewDecoder(xmlReader).Decode(yourPinnacleLineFeed); err != nil {
    return // or log.Panic(err.Error()) if in main
}

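This applies if your XML "file" is a string. If you are getting it from the internet (as the body of an HTTP response, for instance), resp.Body already satisfies io.Reader, so you can skip the first line. If you are opening a real file on the OS, you can also open it as a Reader and decode it the same way.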
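To illustrate the point about readers, here is a minimal sketch assuming a trimmed-down struct named Feed (a hypothetical stand-in for Pinnacle_Line_Feed) and two hypothetical helpers, decodeFeed and decodeFeedString. The decoding logic only depends on io.Reader, so the same function could be handed a string reader, an HTTP response body, or an opened file.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"log"
	"strings"
)

// Feed is a hypothetical, trimmed-down stand-in for Pinnacle_Line_Feed.
// Elements that have no matching field are simply skipped by the decoder.
type Feed struct {
	PinnacleFeedTime string `xml:"PinnacleFeedTime"`
	LastGame         string `xml:"lastGame"`
}

// decodeFeed reads one XML document from any io.Reader.
func decodeFeed(r io.Reader) (*Feed, error) {
	feed := new(Feed)
	if err := xml.NewDecoder(r).Decode(feed); err != nil {
		return nil, err
	}
	return feed, nil
}

// decodeFeedString wraps a string in a reader and decodes it.
func decodeFeedString(s string) (*Feed, error) {
	return decodeFeed(strings.NewReader(s))
}

func main() {
	const doc = `<pinnacle_line_feed>
	<PinnacleFeedTime>1439954818555</PinnacleFeedTime>
	<lastContest>34317132</lastContest>
	<lastGame>218491030</lastGame>
</pinnacle_line_feed>`

	feed, err := decodeFeedString(doc)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(feed.PinnacleFeedTime, feed.LastGame)

	// For an HTTP response, decodeFeed(resp.Body) would work the same way;
	// for a file, pass the *os.File returned by os.Open.
}
```

Accepting an io.Reader rather than a string keeps the decoding code independent of where the XML comes from.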

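EDIT: Two more things:

  • You can nest the structs and drop the xml.Name fields for simplicity and clarity.
  • I also noticed you forgot the participants level in the structs, so no participants were unmarshaled.

Here is a simpler version that works, with an optional function to check what is inside the result: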
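As an alternative to adding an explicit struct for each wrapper level, encoding/xml also accepts a path such as "participants>participant" in the field tag, which tells the decoder to descend through the wrapper element. The sketch below is not from the original answer; the type names Feed, Event, and Participant and the helper parseFeed are chosen here for illustration only.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Participant holds the fields of one <participant> element.
type Participant struct {
	Name   string `xml:"participant_name"`
	Rotnum int    `xml:"rotnum"`
	Side   string `xml:"visiting_home_draw"`
}

// Event reaches through the <participants> wrapper with an "a>b" tag path.
type Event struct {
	League       string        `xml:"league"`
	Participants []Participant `xml:"participants>participant"`
}

// Feed reaches through the <events> wrapper in the same way.
type Feed struct {
	Events []Event `xml:"events>event"`
}

// parseFeed is a hypothetical helper that unmarshals a whole document.
func parseFeed(doc string) (Feed, error) {
	var feed Feed
	err := xml.Unmarshal([]byte(doc), &feed)
	return feed, err
}

func main() {
	const doc = `<pinnacle_line_feed><events><event>
	<league>AFL</league>
	<participants>
		<participant>
			<participant_name>Hawthorn Hawks</participant_name>
			<rotnum>1251</rotnum>
			<visiting_home_draw>Visiting</visiting_home_draw>
		</participant>
		<participant>
			<participant_name>Port Adelaide Power</participant_name>
			<rotnum>1252</rotnum>
			<visiting_home_draw>Home</visiting_home_draw>
		</participant>
	</participants>
</event></events></pinnacle_line_feed>`

	feed, err := parseFeed(doc)
	if err != nil {
		log.Fatal(err)
	}
	for _, ev := range feed.Events {
		for _, p := range ev.Participants {
			fmt.Println(ev.League, p.Name, p.Rotnum, p.Side)
		}
	}
}
```

Whether path tags or nested structs read better is largely a matter of taste; path tags keep the types flat, while explicit nesting mirrors the document shape.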

package main

import (
	"bytes"
	"encoding/json"
	"encoding/xml"
	"fmt"
	"log"
)

type Pinnacle_Line_Feed struct {
	PinnacleFeedTime string `xml:"PinnacleFeedTime"`
	LastContest      string `xml:"lastContest"`
	LastGame         string `xml:"lastGame"`
	Events           struct {
		Event []struct {
			Event_datetimeGMT string `xml:"event_datetimeGMT"`
			Gamenumber        string `xml:"gamenumber"`
			Sporttype         string `xml:"sporttype"`
			League            string `xml:"league"`
			IsLive            string `xml:"IsLive"`
			Participants      struct {
				Participant []struct {
					Participant_name   string `xml:"participant_name"`
					Contestantnum      int    `xml:"contestantnum"`
					Rotnum             int    `xml:"rotnum"`
					Visiting_home_draw string `xml:"visiting_home_draw"`
				} `xml:"participant"`
			} `xml:"participants"`
		} `xml:"event"`
	} `xml:"events"`
}

func main() {
	pinny_xml := `
         <pinnacle_line_feed>
            <PinnacleFeedTime>1439954818555</PinnacleFeedTime>
            <lastContest>34317132</lastContest>
            <lastGame>218491030</lastGame>
            <events>
                <event>
                    <event_datetimeGMT>2015-08-21 09:50</event_datetimeGMT>
                    <gamenumber>483406220</gamenumber>
                    <sporttype>Aussie Rules</sporttype>
                    <league>AFL</league>
                    <IsLive>No</IsLive>
                    <participants>
                        <participant>
                            <participant_name>Hawthorn Hawks</participant_name>
                            <contestantnum>1251</contestantnum>
                            <rotnum>1251</rotnum>
                            <visiting_home_draw>Visiting</visiting_home_draw>
                        </participant>
                        <participant>
                            <participant_name>Port Adelaide Power</participant_name>
                            <contestantnum>1252</contestantnum>
                            <rotnum>1252</rotnum>
                            <visiting_home_draw>Home</visiting_home_draw>
                        </participant>
                    </participants>
                    <periods></periods>
                </event>
            </events>
        </pinnacle_line_feed>
    `

	xmlReader := bytes.NewReader([]byte(pinny_xml))
	yourPinnacleLineFeed := new(Pinnacle_Line_Feed)
	if err := xml.NewDecoder(xmlReader).Decode(yourPinnacleLineFeed); err != nil {
		log.Panic(err.Error())
	}

	printX(yourPinnacleLineFeed)
}

func printX(x interface{}) (err error) {
	var xBytes []byte
	xBytes, err = json.MarshalIndent(x, "", "  ")
	if err != nil {
		return
	}
	fmt.Println(string(xBytes))
	return
}
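The question asked for something resembling a Python dictionary. One possible way to get there, sketched under the assumption that one flat map per participant is what is wanted, is to decode into structs first and then copy the fields into map[string]string values. The type Feed and the helper flattenFeed are hypothetical names introduced for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
	"strconv"
)

// Feed keeps only the fields needed for the flattened rows.
type Feed struct {
	Events []struct {
		EventDatetimeGMT string `xml:"event_datetimeGMT"`
		Gamenumber       string `xml:"gamenumber"`
		Participants     []struct {
			Name             string `xml:"participant_name"`
			Rotnum           int    `xml:"rotnum"`
			VisitingHomeDraw string `xml:"visiting_home_draw"`
		} `xml:"participants>participant"`
	} `xml:"events>event"`
}

// flattenFeed returns one map per participant, with the parent
// event's fields copied into each map.
func flattenFeed(doc string) ([]map[string]string, error) {
	var f Feed
	if err := xml.Unmarshal([]byte(doc), &f); err != nil {
		return nil, err
	}
	var rows []map[string]string
	for _, ev := range f.Events {
		for _, p := range ev.Participants {
			rows = append(rows, map[string]string{
				"event_datetimeGMT":  ev.EventDatetimeGMT,
				"gamenumber":         ev.Gamenumber,
				"participant_name":   p.Name,
				"rotnum":             strconv.Itoa(p.Rotnum),
				"visiting_home_draw": p.VisitingHomeDraw,
			})
		}
	}
	return rows, nil
}

func main() {
	const doc = `<pinnacle_line_feed><events><event>
	<event_datetimeGMT>2015-08-21 09:50</event_datetimeGMT>
	<gamenumber>483406220</gamenumber>
	<participants>
		<participant>
			<participant_name>Port Adelaide Power</participant_name>
			<rotnum>1252</rotnum>
			<visiting_home_draw>Home</visiting_home_draw>
		</participant>
	</participants>
</event></events></pinnacle_line_feed>`

	rows, err := flattenFeed(doc)
	if err != nil {
		log.Fatal(err)
	}
	for _, row := range rows {
		fmt.Println(row)
	}
}
```

Typed structs are usually the more idiomatic choice in Go; the map form is mainly useful when the keys need to be handled generically.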

Hope this helps!

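Answer 2

Score: 1

You need to capitalize the first letter of every field name in your structs. The XML decoder can only populate exported fields, so lowercase fields are silently left empty. Go Playground example here.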
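The effect of exporting can be seen in a small side-by-side sketch. The two struct types and the helper decodeBoth below are hypothetical and exist only for this comparison: both types carry the same tag, but only the one with an exported field receives a value, and no error is reported for the other.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// unexportedFields mirrors the style used in the question: the field
// starts with a lowercase letter, so encoding/xml skips it.
type unexportedFields struct {
	league string `xml:"league"`
}

// exportedFields uses the same tag on an exported field.
type exportedFields struct {
	League string `xml:"league"`
}

// decodeBoth unmarshals the same document into both types and returns
// the value each one ended up with.
func decodeBoth(doc string) (unexported, exported string, err error) {
	var u unexportedFields
	if err = xml.Unmarshal([]byte(doc), &u); err != nil {
		return "", "", err
	}
	var e exportedFields
	if err = xml.Unmarshal([]byte(doc), &e); err != nil {
		return "", "", err
	}
	return u.league, e.League, nil
}

func main() {
	u, e, err := decodeBoth(`<event><league>AFL</league></event>`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("unexported field: %q\n", u)
	fmt.Printf("exported field:   %q\n", e)
}
```

This matches the symptom described in the question's edit: decoding succeeds, yet the result stays empty until the fields are exported.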

huangapple
  • Posted on August 21, 2015, 02:34:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/32125816.html