Golang XML processing

Question

I am trying to process XML files with a complicated structure in Go using the standard encoding/xml package, change the values of a couple of nodes, and save the altered file. For example:

<description>
    <title-info>
        <genre>Comedy</genre>
        <author>
            <first-name>Kevin</first-name>
            <last-name>Smith</last-name>
        </author>
        <movie-title>Clerks</movie-title>
        <annotation>
            <p>!!!</p>
        </annotation>
        <keywords>comedy,jay,bob</keywords>
        <date></date>
    </title-info>
</description>

And many more fields. I would like to change the node:

<author>
    <first-name>Kevin</first-name>
    <last-name>Smith</last-name>
</author>

to

<author>
    <first-name>K.</first-name>
    <middle-name>Patrick</middle-name>
    <last-name>Smith</last-name>
</author>

However, since the files are massive and use more than 50 tags, I really don't want to describe the complete structure just to unmarshal them, so I have

type Result struct {
    Title   string   `xml:"description>title-info>movie-title"`
    Authors []Author `xml:"description>title-info>author"`
}

type Author struct {
    Fname string `xml:"first-name"`
    Mname string `xml:"middle-name"`
    Lname string `xml:"last-name"`
}

for the fields I need to work with, but I don't know how to keep the rest of the file untouched. It looks like I need to use xml.Decoder to pick out the nodes I need to change (as in the post at http://blog.davidsingleton.org/parsing-huge-xml-files-with-go/) while passing the remaining tokens straight through to xml.Encoder, but I can't turn this idea into working code.

Answer 1

Score: 3

Is it a constraint that you only use the standard library?

If not, I'd recommend etree (https://github.com/beevik/etree), which puts a DOM on top of the standard library's XML processing. It supports a basic XPath-like syntax for selecting nodes, and you can easily edit them once you have them.

Posted on June 29, 2015 at 05:08:17