How to get the first-level comment from the root of an XML document

Question

My XSD file has the following structure:

<?xml version="1.0" encoding="UTF-8"?>
<!-- EIS docs-ws-api Integration Scheme, version 6.4, create date 15.11.2016 -->
<someTag></someTag>

How can I get this comment?
I tried it on the playground: https://play.golang.org/p/PVHux_Gvb7

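Answer 1

Score: 2

As mentioned in the other answer, xml.Unmarshal can parse a comment only if it is part of an XML element.

In your case, the external library xmlpath, which implements a subset of the XPath specification, can be useful.

> Install: go get gopkg.in/xmlpath.v1

Let's extract the comment that precedes the tag named someTag in your example XML.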

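package main

import (
	"fmt"
	"log"
	"strings"

	xmlpath "gopkg.in/xmlpath.v1"
)

func main() {
	// The XML declaration starts right after the backtick so that nothing precedes it.
	data := `<?xml version="1.0" encoding="UTF-8"?>
<!-- EIS docs-ws-api Integration Scheme, version 6.4, create date 15.11.2016 -->
<someTag></someTag>
`
	// Select comment nodes that come before the someTag root element.
	path := xmlpath.MustCompile("/someTag/preceding::comment()")
	root, err := xmlpath.Parse(strings.NewReader(data))
	if err != nil {
		log.Fatal(err)
	}
	// String returns the text of the first matching node, if there is one.
	if comment, ok := path.String(root); ok {
		fmt.Println(comment)
	}
}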

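Answer 2

Score: 1

Short answer: the comment above is not inside the root element. It sits outside the root element, so xml.Unmarshal cannot read it.

Explanation

> Each XML document has exactly one single root element. It encloses all
> the other elements and is therefore the sole parent element to all the
> other elements. Root elements are also called document elements.

According to the documentation for encoding/xml:

> * If the XML element contains comments, they are accumulated in the first struct field that has tag ",comment". The struct field may
> have type []byte or string. If there is no such field, the comments
> are discarded.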
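The rule quoted above can be checked with a small program. This is only a sketch: the SomeTag struct and the unmarshalComment helper are hypothetical names introduced for illustration, and the two input documents are made up. It decodes one document whose comment is outside the root element and one whose comment is inside it.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// SomeTag is a hypothetical struct for the <someTag> root element.
// Comments found inside the element are collected in Comment.
type SomeTag struct {
	XMLName xml.Name `xml:"someTag"`
	Comment string   `xml:",comment"`
}

// unmarshalComment (hypothetical helper) decodes data into SomeTag
// and returns whatever ended up in the ",comment" field.
func unmarshalComment(data string) (string, error) {
	var v SomeTag
	if err := xml.Unmarshal([]byte(data), &v); err != nil {
		return "", err
	}
	return v.Comment, nil
}

func main() {
	outside := `<?xml version="1.0" encoding="UTF-8"?>
<!-- outside the root -->
<someTag></someTag>`
	inside := `<?xml version="1.0" encoding="UTF-8"?>
<someTag><!-- inside the root --></someTag>`

	for _, doc := range []string{outside, inside} {
		comment, err := unmarshalComment(doc)
		if err != nil {
			log.Fatal(err)
		}
		// %q makes an empty result visible as "".
		fmt.Printf("%q\n", comment)
	}
}
```

The first document yields an empty string and the second yields the comment text, which matches the behavior described in the quoted documentation.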

Because the comment lies outside every element, xml.Unmarshal cannot decode it into a struct.

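An expanded example of an XML document follows, showing the rootElement element and the headers that come before it.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example [<!ENTITY copy "&#xA9;">]>
<rootElement attribute="xyz">
   <contentElement/>
</rootElement>
<!-- comment nodes may appear almost anywhere -->

Here is the current (January 2017) standard for XML according to the W3C: link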

huangapple
  • Posted on January 27, 2017, 04:32:50
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41882452.html