How to get 1st-level comment from root element of xml
Question
My XSD file has the following structure:
<?xml version="1.0" encoding="UTF-8"?>
<!-- EIS docs-ws-api Integration Scheme, version 6.4, create date 15.11.2016 -->
<someTag></someTag>
How can I get this comment?
I tried it on the playground: https://play.golang.org/p/PVHux_Gvb7
Answer 1
Score: 2
As mentioned in the other answer, xml.Unmarshal can capture a comment only if it is part of an XML element.
The external library xmlpath, which implements a subset of the XPath specification, can be useful in your case.
> Install: go get gopkg.in/xmlpath.v1
Let's extract the comment preceding the tag named someTag from your example XML.
package main

import (
	"fmt"
	"log"
	"strings"

	xmlpath "gopkg.in/xmlpath.v1"
)

func main() {
	data := `
<?xml version="1.0" encoding="UTF-8"?>
<!-- EIS docs-ws-api Integration Scheme, version 6.4, create date 15.11.2016 -->
<someTag></someTag>
`
	// Select every comment node that precedes the root element someTag.
	path := xmlpath.MustCompile("/someTag/preceding::comment()")
	root, err := xmlpath.Parse(strings.NewReader(data))
	if err != nil {
		log.Fatal(err)
	}
	// String returns the text of the first node matched by the path.
	if comment, ok := path.String(root); ok {
		fmt.Println(comment)
	}
}
Answer 2
Score: 1
Short answer: the comment above is not inside the root element. It sits outside the root element, so xml.Unmarshal cannot read it.
Explanation:
> Each XML document has exactly one single root element. It encloses all
> the other elements and is therefore the sole parent element to all the
> other elements. ROOT elements are also called document elements.
According to the documentation for encoding/xml:
> * If the XML element contains comments, they are accumulated in the first struct field that has tag ",comment". The struct field may
> have type []byte or string. If there is no such field, the comments
> are discarded.
Since the comment is outside all the elements, you cannot decode it into a struct with xml.Unmarshal.
An expanded example of an XML document follows, showing the rootElement element and the document headers.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE example [<!ENTITY copy "&#xA9;">]>
<rootElement attribute="xyz">
<contentElement/>
</rootElement>
<!-- comment nodes may appear almost anywhere -->
Here is the current (January 2017) XML standard according to the W3C: link