Unmarshaling XML in Go with Conflicting Element Names

Question

I have the following XML, externally defined and outside of my organization's control:

<foo>
  <bar>
    <zip>zip</zip>
  </bar>
  <bar>
    <zap>zap</zap>
  </bar>
</foo>

I am using these structs:

type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Bar1    Bar1
	Bar2    Bar2
}

type Bar1 struct {
	XMLName xml.Name `xml:"bar"`
	Zip     string   `xml:"zip"`
}

type Bar2 struct {
	XMLName xml.Name `xml:"bar"`
	Zap     string   `xml:"zap"`
}

Because of the conflicting 'bar' name, nothing gets unmarshaled. How can I populate the Bar1 and Bar2 structs?

This is what I have: https://play.golang.org/p/D2IRLojcTB

This is the result I want: https://play.golang.org/p/Ytrbzzy9Ok

In the second one, I have renamed the second 'bar' to 'bar1', and it all works. I'd rather come up with a cleaner solution than modifying the incoming XML.

Answer 1

Score: 11

The encoding/xml package can't do exactly what you want, because it decides which field of Foo to decode into when it encounters the <bar> element, rather than while processing that element's children. Your struct definitions make this decision ambiguous, as the error from xml.Unmarshal indicates:

> main.Foo field "Bar1" with tag "" conflicts with field "Bar2" with tag ""
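
To see where that message comes from, here is a minimal sketch that feeds the XML from the question into the original structs and prints whatever xml.Unmarshal returns. The helper name decodeConflicting and the input constant are made up for this illustration; the struct definitions are the ones from the question.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Bar1 and Bar2 both claim the element name "bar" through XMLName.
type Bar1 struct {
	XMLName xml.Name `xml:"bar"`
	Zip     string   `xml:"zip"`
}

type Bar2 struct {
	XMLName xml.Name `xml:"bar"`
	Zap     string   `xml:"zap"`
}

// Foo has two untagged fields that both resolve to the name "bar".
type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Bar1    Bar1
	Bar2    Bar2
}

// input is the document from the question, on one line.
const input = `<foo><bar><zip>zip</zip></bar><bar><zap>zap</zap></bar></foo>`

// decodeConflicting (hypothetical helper) attempts the decode and
// hands back the error instead of discarding it.
func decodeConflicting(data string) (Foo, error) {
	var foo Foo
	err := xml.Unmarshal([]byte(data), &foo)
	return foo, err
}

func main() {
	_, err := decodeConflicting(input)
	fmt.Println("error:", err)
}
```

The original playground snippet appears to have ignored the return value of xml.Unmarshal, which would explain why it looked as if nothing happened; checking the error is what surfaces the conflict.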

Here are two alternatives that do work:

1. Use one Bar struct to cover both branches

If you modify your types to read as:

type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Bars    []Bar    `xml:"bar"`
}

type Bar struct {
	Zip string `xml:"zip"`
	Zap string `xml:"zap"`
}

You will now get a slice that represents all the <bar> elements. You can tell whether a given element had a <zip> or a <zap> child by checking which of the corresponding fields is non-empty.

You can try out this version here: https://play.golang.org/p/kguPCYmKX0
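
As a self-contained sketch of this first option (the helper decodeBars and the input constant are names invented here, not taken from the playground link), the slice can be decoded and then inspected like this:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Foo collects every <bar> child into a slice.
type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Bars    []Bar    `xml:"bar"`
}

// Bar covers both shapes of <bar>; the field that was absent
// in the XML is left as the empty string.
type Bar struct {
	Zip string `xml:"zip"`
	Zap string `xml:"zap"`
}

// input is the document from the question, on one line.
const input = `<foo><bar><zip>zip</zip></bar><bar><zap>zap</zap></bar></foo>`

// decodeBars (hypothetical helper) unmarshals the document into Foo.
func decodeBars(data string) (Foo, error) {
	var foo Foo
	err := xml.Unmarshal([]byte(data), &foo)
	return foo, err
}

func main() {
	foo, err := decodeBars(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Decide which branch each <bar> was by looking at the non-empty field.
	for i, bar := range foo.Bars {
		switch {
		case bar.Zip != "":
			fmt.Printf("bar %d: zip=%q\n", i, bar.Zip)
		case bar.Zap != "":
			fmt.Printf("bar %d: zap=%q\n", i, bar.Zap)
		default:
			fmt.Printf("bar %d: neither zip nor zap\n", i)
		}
	}
}
```

The slice keeps the <bar> elements in document order, so the position of each branch is preserved as well as its content.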

2. Use child selectors

If you are only interested in a single child element of <bar> in each branch, then you might not need a struct to represent that element at all. For example, you could decode into the following type:

type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Zip     string   `xml:"bar>zip"`
	Zap     string   `xml:"bar>zap"`
}

Now the children of the <bar> elements will be decoded directly into members of the Foo struct. Note that with this option you won't be able to distinguish your chosen input from, for example:

<foo>
  <bar>
    <zip>zip</zip>
    <zap>zap</zap>
  </bar>
</foo>

If that will cause problems, then you should pick the first solution.

You can try out this version here: https://play.golang.org/p/fAE_HSrv4y
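
The trade-off of the second option can be checked with a short sketch. It decodes both the two-<bar> document from the question and the single-<bar> variant into the same Foo type and compares the results. The names decodeFlat, twoBars and oneBar are assumptions made for this example only.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Foo reaches through <bar> with the "parent>child" tag syntax,
// so no Bar type is needed.
type Foo struct {
	XMLName xml.Name `xml:"foo"`
	Zip     string   `xml:"bar>zip"`
	Zap     string   `xml:"bar>zap"`
}

// twoBars is the input from the question; oneBar is the variant
// with both children inside a single <bar>.
const (
	twoBars = `<foo><bar><zip>zip</zip></bar><bar><zap>zap</zap></bar></foo>`
	oneBar  = `<foo><bar><zip>zip</zip><zap>zap</zap></bar></foo>`
)

// decodeFlat (hypothetical helper) unmarshals a document into Foo.
func decodeFlat(data string) (Foo, error) {
	var foo Foo
	err := xml.Unmarshal([]byte(data), &foo)
	return foo, err
}

func main() {
	a, errA := decodeFlat(twoBars)
	b, errB := decodeFlat(oneBar)
	if errA != nil || errB != nil {
		fmt.Println("errors:", errA, errB)
		return
	}
	fmt.Printf("zip=%q zap=%q\n", a.Zip, a.Zap)
	// Both documents produce the same struct value, so the original
	// grouping of the children is lost.
	fmt.Println("same result:", a == b)
}
```

If the code downstream needs to know which <bar> a value came from, the slice-based first option keeps that information.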

huangapple
  • Posted on March 10, 2015, 08:52:24
  • When reposting, please keep the link to this article: https://go.coder-hub.com/28954415.html