Mixed XML decoding in Go, preserving element order
Question
I need to extract offers from an XML document while preserving the order of the nodes:
<pre>
<items>
  <offer/>
  <product>
    <offer/>
    <offer/>
  </product>
  <offer/>
  <offer/>
</items>
</pre>
The following struct decodes the values, but into two separate slices, so the original order is lost:
<pre>
type Offers struct {
	Offers   []offer `xml:"items>offer"`
	Products []offer `xml:"items>product>offer"`
}
</pre>
Any ideas?
Answer 1
Score: 8
One way is to implement a custom UnmarshalXML method. Let's say our input looks like this:
<doc>
  <head>My Title</head>
  <p>A first paragraph.</p>
  <p>A second one.</p>
</doc>
We want to deserialize the document and preserve the order of the head and the paragraphs. For order we need a slice. To accommodate both head and p, we need an interface. We could define our document like this:
type Document struct {
	XMLName  xml.Name `xml:"doc"`
	Contents []Mixed  `xml:",any"`
}
The ",any" option collects every child element that no other field matches into Contents. The slice elements are of type Mixed, which we still need to define:
type Mixed struct {
	Type  string      // just keep "head" or "p" in here
	Value interface{} // keep the value, we could use string here, too
}
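As an aside, when every element only carries text, the ",any" option may be enough on its own, because an untagged XMLName field records the name of whichever element was matched. The following is a minimal sketch of that simpler variant, not part of the original answer; Element, AnyDoc and elementNames are hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Element is a hypothetical catch-all type: XMLName records the tag name
// of the matched element and Text holds its character data.
type Element struct {
	XMLName xml.Name
	Text    string `xml:",chardata"`
}

// AnyDoc (hypothetical name) collects all children of <doc> in document order.
type AnyDoc struct {
	XMLName  xml.Name  `xml:"doc"`
	Contents []Element `xml:",any"`
}

// elementNames decodes s and returns the child element names in order.
func elementNames(s string) ([]string, error) {
	var doc AnyDoc
	if err := xml.Unmarshal([]byte(s), &doc); err != nil {
		return nil, err
	}
	names := make([]string, 0, len(doc.Contents))
	for _, e := range doc.Contents {
		names = append(names, e.XMLName.Local)
	}
	return names, nil
}

func main() {
	names, err := elementNames(`<doc><head>My Title</head><p>One.</p><p>Two.</p></doc>`)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(names) // [head p p]
}
```

This keeps the order but gives every element the same shape, which is why the answer continues with a custom UnmarshalXML.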
We need more control over the deserialization process, so we turn Mixed into an xml.Unmarshaler by implementing UnmarshalXML. We choose the code path based on the name of the start element, e.g. head or p. Here we only populate our Mixed struct with some values, but you can do essentially anything at this point:
func (m *Mixed) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	switch start.Name.Local {
	case "head", "p":
		// Decode the element's character data into a plain string.
		var e string
		if err := d.DecodeElement(&e, &start); err != nil {
			return err
		}
		m.Value = e
		m.Type = start.Name.Local
	default:
		return fmt.Errorf("unknown element: %s", start.Name.Local)
	}
	return nil
}
Putting it all together, usage of the above structs could look like this:
func main() {
	s := `
<doc>
  <head>My Title</head>
  <p>A first paragraph.</p>
  <p>A second one.</p>
</doc>
`
	var doc Document
	if err := xml.Unmarshal([]byte(s), &doc); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("#%v", doc)
}
This prints:
#{{ doc} [{head My Title} {p A first paragraph.} {p A second one.}]}
We preserved the order and kept some type information. Instead of a single type like Mixed, you could decode into several different types. The cost of this approach is that the values in your container (here the Contents field of the document) are stored as interface{}. To do anything element-specific, you'll need a type assertion or a helper method.
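To illustrate the last point, here is a rough sketch of decoding into several concrete types and then recovering them with a type switch. It is one possible arrangement, not the answer's own code: Head, Paragraph, Node, TypedDoc, describe and summarize are all hypothetical names introduced for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Head and Paragraph are hypothetical concrete types, one per element name.
type Head struct {
	Title string `xml:",chardata"`
}

type Paragraph struct {
	Text string `xml:",chardata"`
}

// Node wraps whichever concrete value was decoded for one child element.
type Node struct {
	Value interface{}
}

type TypedDoc struct {
	XMLName  xml.Name `xml:"doc"`
	Contents []Node   `xml:",any"`
}

// UnmarshalXML picks the concrete type from the element name.
func (n *Node) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	switch start.Name.Local {
	case "head":
		var h Head
		if err := d.DecodeElement(&h, &start); err != nil {
			return err
		}
		n.Value = h
	case "p":
		var p Paragraph
		if err := d.DecodeElement(&p, &start); err != nil {
			return err
		}
		n.Value = p
	default:
		return fmt.Errorf("unknown element: %s", start.Name.Local)
	}
	return nil
}

// describe is a helper that hides the type switch from callers.
func describe(n Node) string {
	switch v := n.Value.(type) {
	case Head:
		return "head: " + v.Title
	case Paragraph:
		return "p: " + v.Text
	default:
		return "unknown"
	}
}

// summarize decodes s and describes each child element in document order.
func summarize(s string) ([]string, error) {
	var doc TypedDoc
	if err := xml.Unmarshal([]byte(s), &doc); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(doc.Contents))
	for _, n := range doc.Contents {
		out = append(out, describe(n))
	}
	return out, nil
}

func main() {
	lines, err := summarize(`<doc><head>My Title</head><p>A first paragraph.</p><p>A second one.</p></doc>`)
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Keeping the type switch in one helper such as describe means the rest of the program never has to touch interface{} directly.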
Complete code on the Go Playground: https://play.golang.org/p/fzsUPPS7py
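Applied to the XML from the question, the same idea could look roughly like the sketch below. It assumes that the goal is a single flat slice of offers in document order, regardless of whether an offer sits directly under items or inside a product. The id attribute is not in the original XML and is added only to make the order visible; Offer, Items and offerIDs are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// Offer is a hypothetical offer type; the id attribute is an assumption
// added so the resulting order can be checked.
type Offer struct {
	ID string `xml:"id,attr"`
}

// Items collects every <offer> below <items>, at any depth, in document order.
type Items struct {
	Offers []Offer
}

// UnmarshalXML walks the token stream of <items>. Offers are decoded as they
// are encountered; other elements such as <product> are entered, not skipped,
// so their nested offers are visited in place.
func (it *Items) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	depth := 0 // number of currently open non-offer elements below <items>
	for {
		tok, err := d.Token()
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == "offer" {
				var o Offer
				// DecodeElement consumes the offer up to and including its end tag.
				if err := d.DecodeElement(&o, &t); err != nil {
					return err
				}
				it.Offers = append(it.Offers, o)
			} else {
				depth++
			}
		case xml.EndElement:
			if depth == 0 {
				return nil // this is the closing </items>
			}
			depth--
		}
	}
}

// offerIDs decodes s and returns the offer ids in document order.
func offerIDs(s string) ([]string, error) {
	var items Items
	if err := xml.Unmarshal([]byte(s), &items); err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(items.Offers))
	for _, o := range items.Offers {
		ids = append(ids, o.ID)
	}
	return ids, nil
}

func main() {
	s := `<items>
  <offer id="1"/>
  <product>
    <offer id="2"/>
    <offer id="3"/>
  </product>
  <offer id="4"/>
  <offer id="5"/>
</items>`
	ids, err := offerIDs(s)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(ids) // [1 2 3 4 5]
}
```

If you also need to know which product an offer belonged to, you could record that in the Offer value while the product element is open; this sketch deliberately discards it.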