How to let golang distinguish XML elements with and without namespaces?

Question

Suppose I have the following XML data:

<image>
  <url>http://sampleUrl.com</url>
</image>

<itunes:image url="http://sampleItunesUrl.com" />  // xmlData

I use this struct to decode it:

type Response struct {
	XMLName xml.Name `xml:"resp"`
	Image   []struct {
		URL string `xml:"url"`
	} `xml:"image"`
	ItunesImage struct {
		URL string `xml:"url,attr"`
	} `xml:"http://www.itunes.com/dtds/podcast-1.0.dtd image"`
}

and I process the decoded result with code like this:

var resp Response
err := xml.Unmarshal([]byte(xmlData), &resp)
if err != nil {
	fmt.Printf("Error decoding XML: %s\n", err)
	return
}
for _, img := range resp.Image {
	if img.URL != "" {
		// <image>, not <itunes:image>
	}
}

The problem is that Image []struct appears to treat both <image> and <itunes:image> as <image> elements, since both have the local name "image". To filter out <itunes:image>, I currently check whether each entry in Image []struct has an empty URL (for <itunes:image>, the URL is an attribute, so there is no <url> child to decode).

Another approach would be to write my own Unmarshal function that distinguishes XML elements with and without the itunes namespace. Basically, I want Image []struct to hold only <image> elements.

Does Go have built-in functionality to distinguish them, or do I have to write my own code to filter out <itunes:image>?

Answer 1

Score: 2

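Please note that the field order matters. A tag without a namespace, such as xml:"image", also matches elements that carry a namespace, so Image would capture <itunes:image> if it came first. ItunesImage has the more specific tag, so it should go before Image.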

package main

import (
	"encoding/xml"
	"fmt"
)

func main() {
	type Response struct {
		XMLName     xml.Name `xml:"resp"`
		ItunesImage struct {
			URL string `xml:"url,attr"`
		} `xml:"http://www.itunes.com/dtds/podcast-1.0.dtd image"`
		Image []struct {
			URL string `xml:"url"`
		} `xml:"image"`
	}

	xmlData := `
<resp xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <image>
    <url>http://sampleUrl.com</url>
  </image>
  <itunes:image url="http://sampleItunesUrl.com/" />
</resp>
`

	var resp Response
	err := xml.Unmarshal([]byte(xmlData), &resp)
	if err != nil {
		fmt.Printf("Error decoding XML: %s\n", err)
		return
	}
	fmt.Printf("ItunesImage: %v\n", resp.ItunesImage)
	fmt.Printf("Images: %v\n", resp.Image)
}

If you only need the URL text from <url> children of the <image> tag, you could define the Response struct like this:

type Response struct {
	XMLName xml.Name `xml:"resp"`
	Image   []string `xml:"image>url"`
}
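
A minimal runnable sketch of this variant, assuming the same sample document as in the first demo. The helper name decodeImageURLs and the constant sampleXML are hypothetical and exist only for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Response collects the text of every <url> child found under an
// element whose local name is "image".
type Response struct {
	XMLName xml.Name `xml:"resp"`
	Image   []string `xml:"image>url"`
}

// sampleXML mirrors the sample data from the answer (illustrative only).
const sampleXML = `
<resp xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <image>
    <url>http://sampleUrl.com</url>
  </image>
  <itunes:image url="http://sampleItunesUrl.com/" />
</resp>
`

// decodeImageURLs is a hypothetical helper that decodes the document
// and returns the collected <url> values.
func decodeImageURLs(data string) ([]string, error) {
	var resp Response
	if err := xml.Unmarshal([]byte(data), &resp); err != nil {
		return nil, err
	}
	return resp.Image, nil
}

func main() {
	urls, err := decodeImageURLs(sampleXML)
	if err != nil {
		fmt.Printf("Error decoding XML: %s\n", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

With this data the slice only receives the URL from <image>, because <itunes:image> stores its URL in an attribute and has no <url> child. That is a consequence of the shape of the data, not a namespace check.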

Regarding the general question in the title (how to let Go distinguish XML elements with and without namespaces): you can capture the element name with a field named XMLName of type xml.Name and examine its Space member, which holds the namespace URI and is empty for elements without a namespace. See the demo below:

package main

import (
	"encoding/xml"
	"fmt"
)

func main() {
	type Response struct {
		XMLName xml.Name `xml:"resp"`
		Image   []struct {
			XMLName xml.Name
			URL     string `xml:"url"`
		} `xml:"image"`
	}

	xmlData := `
<resp xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <image>
    <url>http://sampleUrl.com</url>
  </image>
  <itunes:image>
    <url>http://sampleItunesUrl.com/</url>
  </itunes:image>
</resp>
`

	var resp Response
	err := xml.Unmarshal([]byte(xmlData), &resp)
	if err != nil {
		fmt.Printf("Error decoding XML: %s\n", err)
		return
	}
	for _, img := range resp.Image {
		fmt.Printf("namespace: %q, url: %s\n", img.XMLName.Space, img.URL)
	}
}

The output of the above demo is:

namespace: "", url: http://sampleUrl.com
namespace: "http://www.itunes.com/dtds/podcast-1.0.dtd", url: http://sampleItunesUrl.com/
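
Building on the XMLName technique, the filtering the question asks for could be sketched as follows. This is only one possible approach: the type Image, the helper plainImages, and the constant feedXML are hypothetical names introduced for illustration, and the sample data uses the attribute form of <itunes:image> from the question.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Image records the element name alongside the URL so the namespace
// can be inspected after decoding.
type Image struct {
	XMLName xml.Name
	URL     string `xml:"url"`
}

// Response decodes every element with local name "image",
// regardless of namespace.
type Response struct {
	XMLName xml.Name `xml:"resp"`
	Images  []Image  `xml:"image"`
}

// feedXML is illustrative sample data based on the question.
const feedXML = `
<resp xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <image>
    <url>http://sampleUrl.com</url>
  </image>
  <itunes:image url="http://sampleItunesUrl.com/" />
</resp>
`

// plainImages is a hypothetical helper that keeps only the elements
// whose name has no namespace, i.e. plain <image> elements.
func plainImages(data string) ([]Image, error) {
	var resp Response
	if err := xml.Unmarshal([]byte(data), &resp); err != nil {
		return nil, err
	}
	var out []Image
	for _, img := range resp.Images {
		if img.XMLName.Space == "" {
			out = append(out, img)
		}
	}
	return out, nil
}

func main() {
	imgs, err := plainImages(feedXML)
	if err != nil {
		fmt.Printf("Error decoding XML: %s\n", err)
		return
	}
	for _, img := range imgs {
		fmt.Printf("url: %s\n", img.URL)
	}
}
```

Compared with checking for an empty URL, this test depends on the namespace itself, so it would still hold if a namespaced element happened to contain a <url> child.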

huangapple
  • Posted on April 1, 2023 at 06:39:25
  • Link to this article: https://go.coder-hub.com/75903249.html