Golang XML processing
Question
I am trying to process XML files with a complicated structure in Go using the standard encoding/xml package, change the values of a couple of nodes, and save the altered file. For example:
<description>
<title-info>
<genre>Comedy</genre>
<author>
<first-name>Kevin</first-name>
<last-name>Smith</last-name>
</author>
<movie-title>Clerks</movie-title>
<annotation>
<p>!!!</p>
</annotation>
<keywords>comedy,jay,bob</keywords>
<date></date>
</title-info>
</description>
And many more fields. I would like to change the node:
<author>
<first-name>Kevin</first-name>
<last-name>Smith</last-name>
</author>
to
<author>
<first-name>K.</first-name>
<middle-name>Patrick</middle-name>
<last-name>Smith</last-name>
</author>
However, since the files are massive and use more than 50 tags, I really don't want to describe the complete structure just to unmarshal them, so I have
type Result struct {
Title string `xml:"description>title-info>movie-title"`
Authors []Author `xml:"description>title-info>author"`
}
type Author struct {
Fname string `xml:"first-name"`
Mname string `xml:"middle-name"`
Lname string `xml:"last-name"`
}
for the fields I need to work with, but I don't know how to keep the rest of the file untouched. It looks like I need to use xml.Decoder to select the nodes I want to change (as in the post at http://blog.davidsingleton.org/parsing-huge-xml-files-with-go/) while passing the tokens I don't need straight through to xml.Encoder, but I can't turn this idea into code.
Answer 1
Score: 3
Is it a constraint that you only use the standard library?
If not, I'd recommend etree (https://github.com/beevik/etree) which puts a DOM on top of the standard library's XML processing. It has a basic xpath syntax to select nodes, and you can easily edit them once you have them.
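If the standard library is a hard requirement, the token-streaming idea from the question can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names rewriteAuthors and rewriteString are made up for illustration, the XMLName field and the omitempty option are additions to the Author struct from the question, and it assumes a document without namespaces in which every element named author should be edited, at any depth. Every other token is copied from the decoder to the encoder unchanged.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Author follows the struct from the question. XMLName and omitempty are
// additions so the element can be written back under its original name
// and an empty middle name is left out.
type Author struct {
	XMLName xml.Name `xml:"author"`
	Fname   string   `xml:"first-name"`
	Mname   string   `xml:"middle-name,omitempty"`
	Lname   string   `xml:"last-name"`
}

// rewriteAuthors (hypothetical helper) streams tokens from r to w.
// Each <author> element is decoded into a struct, passed to edit, and
// re-encoded; all other tokens are forwarded as they are read.
func rewriteAuthors(r io.Reader, w io.Writer, edit func(*Author)) error {
	dec := xml.NewDecoder(r)
	enc := xml.NewEncoder(w)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return err
		}
		if se, ok := tok.(xml.StartElement); ok && se.Name.Local == "author" {
			var a Author
			// DecodeElement consumes everything up to the matching </author>.
			if err := dec.DecodeElement(&a, &se); err != nil {
				return err
			}
			edit(&a)
			if err := enc.Encode(a); err != nil {
				return err
			}
			continue
		}
		if err := enc.EncodeToken(tok); err != nil {
			return err
		}
	}
	return enc.Flush()
}

// rewriteString (hypothetical helper) is a convenience wrapper for strings.
func rewriteString(in string, edit func(*Author)) (string, error) {
	var buf bytes.Buffer
	err := rewriteAuthors(strings.NewReader(in), &buf, edit)
	return buf.String(), err
}

func main() {
	const input = `<description>
  <title-info>
    <genre>Comedy</genre>
    <author>
      <first-name>Kevin</first-name>
      <last-name>Smith</last-name>
    </author>
    <movie-title>Clerks</movie-title>
  </title-info>
</description>`

	out, err := rewriteString(input, func(a *Author) {
		a.Fname = "K."
		a.Mname = "Patrick"
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

The result is equivalent XML rather than a byte-for-byte copy: the encoder re-escapes character data, writes self-closing tags such as `<date/>` as `<date></date>`, and emits the rewritten author element on a single line without the original indentation. Documents that use namespaces may need extra care, which is one reason a DOM-style library such as etree can be the more comfortable choice.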