2012-11-07 16:49:31 · go

Why does Go's encoding/xml.Decoder.Token() not produce xml.Attr tokens as it should?

Question

Using encoding/xml.Decoder I'm attempting to manually parse an XML file loaded from http://www.khronos.org/files/collada_schema_1_4

For test purposes, I'm just iterating over the document printing out whatever token type is encountered:

func Test(r io.Reader) {
	var t xml.Token
	var pa *xml.Attr
	var a xml.Attr
	var co xml.Comment
	var cd xml.CharData
	var se xml.StartElement
	var pi xml.ProcInst
	var ee xml.EndElement
	var is bool
	var err error
	var xd = xml.NewDecoder(r)
	for i := 0; i < 24; i++ {
		if t, err = xd.Token(); (err == nil) && (t != nil) {
			if a, is = t.(xml.Attr); is { print("ATTR\t"); println(a.Name.Local) }
			if pa, is = t.(*xml.Attr); is { print("*ATTR\t"); println(pa) }
			if co, is = t.(xml.Comment); is { print("COMNT\t"); println(co) }
			if cd, is = t.(xml.CharData); is { print("CDATA\t"); println(cd) }
			if pi, is = t.(xml.ProcInst); is { print("PROCI\t"); println(pi.Target) }
			if se, is = t.(xml.StartElement); is { print("START\t"); println(se.Name.Local) }
			if ee, is = t.(xml.EndElement); is { print("END\t\t"); println(ee.Name.Local) }
		}
	}
}

Now here's the output:

PROCI	xml
CDATA	[1/64]0xf84004e050
START	schema
CDATA	[2/129]0xf84004d090
COMNT	[29/129]0xf84004d090
CDATA	[2/129]0xf84004d090
START	annotation
CDATA	[3/129]0xf84004d090
START	documentation
CDATA	[641/1039]0xf840061000
END		documentation
CDATA	[2/1039]0xf840061000
END		annotation
CDATA	[2/1039]0xf840061000
COMNT	[37/1039]0xf840061000
CDATA	[2/1039]0xf840061000
START	import
END		import
CDATA	[2/1039]0xf840061000
COMNT	[14/1039]0xf840061000
CDATA	[2/1039]0xf840061000
START	element
CDATA	[3/1039]0xf840061000
START	annotation

Notice no ATTR or *ATTR lines are output even though by the last (24th) line many attributes have been passed both in the root xs:schema element as well as in xs:import and xs:element elements.

This is in Go 1.0.3 64-bit under Windows 7 64-bit. Am I doing something wrong or should I file a Go package bug report?

[Side note: when doing a normal xml.Unmarshal into properly prepared structs, known-named-and-mapped attributes are captured and mapped by the xml package just fine. But I also need to collect "unknown" attributes in the root element (to collect namespace information for this use-case, the use-case being http://github.com/metaleap/go-xsd ), hence my attempts to use Decoder.Token().]

Answer 1

Score: 5

Yes, this behavior is expected. The attributes are parsed, but
not returned as an xml.Token. Attributes simply aren't Tokens.
See: http://golang.org/pkg/encoding/xml/#Token

The attributes are accessible through the Attr field of
the StartElement token.
See: http://golang.org/pkg/encoding/xml/#StartElement

(( Some general hints:

a) Do not use print or println; they are built-in bootstrapping helpers. Use the fmt package instead.

b) The a, ok := t.(SomeType) idiom is called "comma ok", because the boolean is normally named "ok", not "is". Please stick to these conventions.

c) Idiomatic would be something like

switch t := t.(type) {
  case xml.StartElement: ...
  case xml.EndElement: ...
}

instead of your list of "if a, is = t.(xml.Attr) ..."

d) All this "var se xml.StartElement" is noise (clutter). Use

if se, ok := t.(xml.StartElement); ok { ... }

This would make your code much more readable. ))
