Java StAX: how to get only the value of specific child nodes

Question

I use StAX to get the node names and node values from my XML file (about 90 MB):

<?xml version="1.0" encoding="UTF-8"?>
<name1>
    <type>
        <coord>67</coord>
        <umc>57657</umc>
    </type>
    <lang>
        <eng>989</eng>
        <spa>123</spa>
    </lang>
</name1>
<name2>
    <type>
        <coord>534</coord>
        <umc>654654</umc>
    </type>
    <lang>
        <eng>354</eng>
        <spa>2424</spa>
    </lang>
</name2>
<name3>
    <type>
        <coord>23432</coord>
        <umc>14324</umc>
    </type>
    <lang>
        <eng>141</eng>
        <spa>142</spa>
    </lang>
</name3>

I can get the local name of each element, but not the content of its child nodes. If I want the values of all child nodes except 'spa', how should I process the stream to get them?

Java:

XMLStreamReader dataXML = factory.createXMLStreamReader(new FileReader(path));
while (dataXML.hasNext())
{
    int type = dataXML.next();
    switch (type)
    {
        case XMLStreamReader.START_ELEMENT:
            System.out.println(dataXML.getLocalName());
            break;

        case XMLStreamReader.CHARACTERS:
            System.out.println(dataXML.getText());
            break;
    }
}

Answer 1

Score: 0

You are using StAX, which is a pull parser: your code pulls events from the parser one at a time, and the parser keeps no model of the overall structure of your document.
For background, see "Differences between DOM, SAX or StAX" and "Java StAX parser".

If you want the children of an XML element, you have to track the nesting yourself.

If you really want convenient access to children, use a DOM parser instead. But as you mentioned, your document is about 90 MB, so loading it fully into memory may be heavy.
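
One way to "track it yourself" is to keep a stack of the currently open element names. The following is only a minimal sketch under some assumptions: the class name StaxPathTracker and the method pathsWithValues are made up for illustration, the XML is passed as a String to keep the example self-contained, and the sample is wrapped in a hypothetical <root> element because a well-formed document has a single top-level element.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: records the element path of every non-blank text node.
class StaxPathTracker {

    /** Returns one "path=value" entry per non-blank text node. */
    static List<String> pathsWithValues(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent text chunks so each element's text arrives as one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        Deque<String> open = new ArrayDeque<>(); // open elements, outermost first
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            open.addLast(reader.getLocalName());
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            open.removeLast();
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            // Skip the indentation between tags.
                            if (!reader.isWhiteSpace()) {
                                result.add(String.join("/", open) + "=" + reader.getText());
                            }
                            break;
                        default:
                            break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<root><name1><type><coord>67</coord></type>"
                   + "<lang><eng>989</eng><spa>123</spa></lang></name1></root>";
        pathsWithValues(xml).forEach(System.out::println);
    }
}
```

Because only the names of the open elements are kept, memory use stays proportional to the nesting depth rather than to the size of the file, which is the reason to stay with StAX for a 90 MB document.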

Answer 2

Score: 0

To keep track of the element being parsed, introduce a variable that holds the current tag name, plus a variable with the tag name(s) of interest:

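String localname = null;   // name of the element currently open
String tagName = "spa";    // element whose text should be skipped

while (dataXML.hasNext()) {
    int type = dataXML.next();
    switch (type) {

        case XMLStreamReader.SPACE:
            // ignorable whitespace, reported only in some parser configurations
            continue;

        case XMLStreamReader.START_ELEMENT:
            localname = dataXML.getLocalName();
            System.out.println(localname);
            break;

        case XMLStreamReader.END_ELEMENT:
            // text after a closing tag no longer belongs to that element
            localname = null;
            break;

        case XMLStreamReader.CHARACTERS:
            // skip indentation between tags and the text of the excluded tag
            if (!dataXML.isWhiteSpace() && !tagName.equals(localname)) {
                System.out.println(dataXML.getText());
            }
            break;
    }
}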

If there are several tags you want to handle, the variable tagName can be replaced with a list:

List<String> tagNames = new ArrayList<>();
tagNames.add("spa");

The check then becomes:

if (!tagNames.contains(localname)) {
    System.out.println(dataXML.getText());
}
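
Put together, the approach might look like the self-contained sketch below. It is an illustration rather than the answer's exact code: the class name StaxChildValues and the method valuesExcept are hypothetical, a Set is used in place of the list for the excluded names, the values are collected instead of printed, and the sample XML is wrapped in an assumed <root> element so that it is well-formed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects "tag=value" for every text node whose
// enclosing element is not in the excluded set.
class StaxChildValues {

    static List<String> valuesExcept(String xml, Set<String> excluded) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Deliver each element's text as a single CHARACTERS event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                String current = null; // element that directly encloses the next text
                while (reader.hasNext()) {
                    switch (reader.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            current = reader.getLocalName();
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            current = null;
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            if (current != null
                                    && !reader.isWhiteSpace()
                                    && !excluded.contains(current)) {
                                result.add(current + "=" + reader.getText());
                            }
                            break;
                        default:
                            break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<root><name1>"
                   + "<type><coord>67</coord><umc>57657</umc></type>"
                   + "<lang><eng>989</eng><spa>123</spa></lang>"
                   + "</name1></root>";
        Set<String> excluded = new HashSet<>(Arrays.asList("spa"));
        valuesExcept(xml, excluded).forEach(System.out::println);
    }
}
```

For the real file, the same loop can read from a stream instead of a String, since XMLInputFactory also offers createXMLStreamReader(InputStream); that way the whole document never has to be held in memory.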

huangapple
  • Posted on 2020-09-20 06:09:27
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63973809.html