Golang xml.Unmarshal interface types

Question

Using the encoding/xml package in Go, I'm having trouble unmarshalling a list of non-homogeneous types. Consider the following XML document, whose nested elements are a list of non-homogeneous types:

<mydoc>
  <foo>Foo</foo>
  <bar>Bar</bar>
  <foo>Another Foo</foo>
  <foo>Foo #3</foo>
  <bar>Bar 2</bar>
</mydoc>

And the following Go code to test XML marshalling and unmarshalling (also available on the Go playground):

package main

import (
    "encoding/xml"
    "fmt"
)

const sampleXml = `
<mydoc>
  <foo>Foo</foo>
  <bar>Bar</bar>
  <foo>Another Foo</foo>
  <foo>Foo #3</foo>
  <bar>Bar 2</bar>
</mydoc>
`

type MyDoc struct {
    XMLName xml.Name `xml:"mydoc"`
    Items   []Item
}

type Item interface {
    IsItem()
}

type Foo struct {
    XMLName xml.Name `xml:"foo"`
    Name    string   `xml:",chardata"`
}

func (f Foo) IsItem() {}

type Bar struct {
    XMLName xml.Name `xml:"bar"`
    Nombre  string   `xml:",chardata"`
}

func (b Bar) IsItem() {}

func main() {
    doMarshal()
    doUnmarshal()
}

func doMarshal() {
    myDoc := MyDoc{
        Items: []Item{
            Foo{Name: "Foo"},
            Bar{Nombre: "Bar"},
            Foo{Name: "Another Foo"},
            Foo{Name: "Foo #3"},
            Bar{Nombre: "Bar 2"},
        },
    }
    bytes, err := xml.MarshalIndent(myDoc, "", "  ")
    if err != nil {
        panic(err)
    }
    // Prints an XML document just like "sampleXml" above.
    println(string(bytes))
}

func doUnmarshal() {
    myDoc := MyDoc{}
    err := xml.Unmarshal([]byte(sampleXml), &myDoc)
    if err != nil {
        panic(err)
    }
    // Fails to unmarshal the "Item" elements into their respective structs.
    fmt.Printf("ERR: %#v", myDoc)
}

You'll see that doMarshal() produces the exact XML document I expect; however, doUnmarshal() fails to deserialize the "Item" elements into their respective structs. I've tried a few changes but nothing seems to get them to unmarshal properly (creating storage for myDoc.Items, changing the type of "Items" to []*Item [and others], fiddling with the XML tags, etc).

Any ideas how to get xml.Unmarshal(...) to deserialize a list of elements of unrelated types?

Answer 1

Score: 4

As pointed out in other comments, the decoder cannot deal with interface fields without some help. Implementing xml.Unmarshaler on the container will make it do what you want (full working example on the playground):

func (md *MyDoc) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
    md.XMLName = start.Name
    // grab any other attrs

    // decode inner elements
    for {
        t, err := d.Token()
        if err != nil {
            return err
        }
        var i Item
        switch tt := t.(type) {
        case xml.StartElement:
            switch tt.Name.Local {
            case "foo":
                i = new(Foo) // the decoded item will be a *Foo, not Foo!
            case "bar":
                i = new(Bar)
                // default: ignored for brevity
            }
            // known child element found, decode it
            if i != nil {
                err = d.DecodeElement(i, &tt)
                if err != nil {
                    return err
                }
                md.Items = append(md.Items, i)
            }
        case xml.EndElement:
            // the closing tag of the container ends the loop
            if tt == start.End() {
                return nil
            }
        }
    }
}

This is just an implementation of what @evanmcdonnal suggests. All this does is instantiate the proper Item based on the name of the next Token, then call d.DecodeElement() with it (i.e. let the xml decoder do the heavy lifting).
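To see the approach end to end, here is a minimal self-contained sketch that combines the types from the question with the UnmarshalXML method above. The helper name decodeDoc and the constant name sampleXML are only illustrative and do not come from the original playground example.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

const sampleXML = `
<mydoc>
  <foo>Foo</foo>
  <bar>Bar</bar>
  <foo>Another Foo</foo>
  <foo>Foo #3</foo>
  <bar>Bar 2</bar>
</mydoc>
`

type Item interface{ IsItem() }

type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Name    string   `xml:",chardata"`
}

func (f Foo) IsItem() {}

type Bar struct {
	XMLName xml.Name `xml:"bar"`
	Nombre  string   `xml:",chardata"`
}

func (b Bar) IsItem() {}

type MyDoc struct {
	XMLName xml.Name `xml:"mydoc"`
	Items   []Item
}

// UnmarshalXML picks a concrete type from the child element's name and
// lets the decoder fill it in.
func (md *MyDoc) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	md.XMLName = start.Name
	for {
		t, err := d.Token()
		if err != nil {
			return err
		}
		switch tt := t.(type) {
		case xml.StartElement:
			var i Item
			switch tt.Name.Local {
			case "foo":
				i = new(Foo)
			case "bar":
				i = new(Bar)
			}
			if i == nil {
				continue // unknown child: not decoded in this sketch
			}
			if err := d.DecodeElement(i, &tt); err != nil {
				return err
			}
			md.Items = append(md.Items, i)
		case xml.EndElement:
			if tt == start.End() {
				return nil
			}
		}
	}
}

// decodeDoc is a hypothetical helper that wraps xml.Unmarshal.
func decodeDoc(data string) (MyDoc, error) {
	var doc MyDoc
	err := xml.Unmarshal([]byte(data), &doc)
	return doc, err
}

func main() {
	doc, err := decodeDoc(sampleXML)
	if err != nil {
		panic(err)
	}
	for _, it := range doc.Items {
		// Each item is a pointer (*Foo or *Bar) stored in the Item interface.
		fmt.Printf("%T %+v\n", it, it)
	}
}
```

Since xml.Unmarshal checks whether the target implements xml.Unmarshaler, no change is needed at the call site: decoding into &doc is enough to trigger the custom method.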

Note that the unmarshalled Items are pointers. You'll need to do some more work if you want values. This also needs to be expanded some more for proper handling of errors or unexpected input data.
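One possible way to address both points is sketched below, under the assumption that unknown child elements should be silently dropped: each case decodes into a local value and appends the value itself (so Items holds Foo and Bar rather than pointers), and anything unrecognized is discarded with d.Skip(). The helper parseDoc and the <baz> element are made up for the illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

type Item interface{ IsItem() }

type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Name    string   `xml:",chardata"`
}

func (f Foo) IsItem() {}

type Bar struct {
	XMLName xml.Name `xml:"bar"`
	Nombre  string   `xml:",chardata"`
}

func (b Bar) IsItem() {}

type MyDoc struct {
	XMLName xml.Name `xml:"mydoc"`
	Items   []Item
}

func (md *MyDoc) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
	md.XMLName = start.Name
	for {
		t, err := d.Token()
		if err != nil {
			return err
		}
		switch tt := t.(type) {
		case xml.StartElement:
			switch tt.Name.Local {
			case "foo":
				var f Foo
				if err := d.DecodeElement(&f, &tt); err != nil {
					return err
				}
				md.Items = append(md.Items, f) // stored as a value, not *Foo
			case "bar":
				var b Bar
				if err := d.DecodeElement(&b, &tt); err != nil {
					return err
				}
				md.Items = append(md.Items, b) // stored as a value, not *Bar
			default:
				// Assumption: unknown elements are dropped, children included.
				if err := d.Skip(); err != nil {
					return err
				}
			}
		case xml.EndElement:
			// Every child was consumed in full above, so the only end tag
			// that can show up at this level closes the container.
			return nil
		}
	}
}

// parseDoc is a hypothetical helper that wraps xml.Unmarshal.
func parseDoc(data string) (MyDoc, error) {
	var doc MyDoc
	err := xml.Unmarshal([]byte(data), &doc)
	return doc, err
}

func main() {
	doc, err := parseDoc(`<mydoc><foo>A</foo><baz><foo>nested</foo></baz><bar>B</bar></mydoc>`)
	if err != nil {
		panic(err)
	}
	for _, it := range doc.Items {
		fmt.Printf("%T %+v\n", it, it)
	}
}
```

Whether to skip unknown elements or return an error for them is a design choice that depends on how strict the input format needs to be; returning an error from the default case would be the stricter variant.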

huangapple
  • Posted on November 3, 2015 at 04:54:03
  • When reposting, please keep the link to this article: https://go.coder-hub.com/33486725.html