How do I parse XML with a namespace using gokogiri (libxml2)?

Question

I am using github.com/moovweb/gokogiri to parse an XML document. The following works when parsing var b, but when I try the same on var a (which has a namespace) I get no output. How do I parse XML that has a namespace using gokogiri?

package main

import (
	"github.com/moovweb/gokogiri"
	"github.com/moovweb/gokogiri/xpath"
	"log"
)

func main() {
	log.SetFlags(log.Lshortfile)
	doc, _ := gokogiri.ParseXml([]byte(a))
	defer doc.Free()
	doc.SetNamespace("", "http://example.com/this")
	x := xpath.Compile(".//NodeA/NodeB")
	groups, err := doc.Search(x)
	if err != nil {
		log.Println(err)
	}
	for i, group := range groups {
		log.Println(i, group)
	}
}

var a = `<?xml version="1.0" ?><NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`
var b = `<?xml version="1.0" ?><NodeA><NodeB>thisthat</NodeB></NodeA>`

EDIT #1:
I've also tried doc.RegisterNamespace, but got:

> doc.RegisterNamespace undefined (type *xml.XmlDocument has no field or method RegisterNamespace)

and x.RegisterNamespace, which gives:

> x.RegisterNamespace undefined (type *xpath.Expression has no field or method RegisterNamespace)

Answer 1

Score: 7

Even though the namespace used in the XML has no prefix (i.e. it is the default namespace), you still need to register a prefix for it and use that prefix in your XPath expression. In XPath 1.0, an unprefixed name such as NodeA only matches elements in no namespace, which is why the original query returns nothing.

The prefix can be anything you like; here I used ns. Note that it can differ from the prefix used in the document (if any): the part that needs to match is the namespace URI itself.


Example:

package main

import (
	"fmt"

	"github.com/moovweb/gokogiri"
	"github.com/moovweb/gokogiri/xpath"
)

func main() {
	doc, err := gokogiri.ParseXml([]byte(a))
	if err != nil {
		fmt.Println(err)
		return
	}
	defer doc.Free()

	// Bind the prefix "ns" to the document's default namespace URI.
	xp := doc.DocXPathCtx()
	xp.RegisterNamespace("ns", "http://example.com/this")

	// Every step of the path uses the registered prefix.
	x := xpath.Compile("/ns:NodeA/ns:NodeB")
	groups, err := doc.Search(x)
	if err != nil {
		fmt.Println(err)
		return
	}
	for i, group := range groups {
		fmt.Println(i, group.Content())
	}
}

var a = `<?xml version="1.0" ?><NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`

Output:

0 thisthat
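
The point that only the namespace URI has to match, not the prefix, can also be checked without gokogiri. The sketch below is not part of the original answer and does not use libxml2. It assumes Go's standard encoding/xml package, and the names nodeA and nodeBContent are made up for illustration. It decodes one document that uses a default namespace and one that uses an x: prefix, and both resolve to the same element names.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// nodeA is a hypothetical type for this illustration. Each tag names an
// element by namespace URI plus local name; no prefix is involved.
type nodeA struct {
	XMLName xml.Name `xml:"http://example.com/this NodeA"`
	NodeB   string   `xml:"http://example.com/this NodeB"`
}

// nodeBContent decodes doc and returns the text of NodeB.
func nodeBContent(doc string) (string, error) {
	var n nodeA
	if err := xml.Unmarshal([]byte(doc), &n); err != nil {
		return "", err
	}
	return n.NodeB, nil
}

func main() {
	docs := []string{
		// Default namespace, as in the question.
		`<NodeA xmlns="http://example.com/this"><NodeB>thisthat</NodeB></NodeA>`,
		// Same namespace URI bound to an explicit prefix.
		`<x:NodeA xmlns:x="http://example.com/this"><x:NodeB>thisthat</x:NodeB></x:NodeA>`,
	}
	for i, d := range docs {
		s, err := nodeBContent(d)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(i, s)
	}
}
```

Both documents yield thisthat, because the decoder resolves each prefix (or the default namespace) to its URI before matching names. Registering ns in the XPath context plays the same role for the query.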

huangapple
  • Published on 2014-12-15 05:19:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/27474239.html