英文:
Read Complex Xml file in java
问题
try {
File fXmlFile = new File("filepath");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList billNodeList = doc.getElementsByTagName("ENVELOPE");
for(int i=0; i<billNodeList.getLength(); i++){
Node voucherNode = billNodeList.item(i);
Element voucherElement = (Element) voucherNode;
NodeList nList = voucherElement.getElementsByTagName("BILLFIXED");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node insideNode = nList.item(temp);
Element voucherElements = (Element) insideNode;
System.out.println(voucherElements.getElementsByTagName("BILLDATE").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLREF").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLPARTY").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLFINAL").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLOVERDUE").item(0).getTextContent());
}
System.out.println(voucherElement.getElementsByTagName("BILLCL").item(0).getTextContent());
System.out.println(voucherElement.getElementsByTagName("BILLPDC").item(0).getTextContent());
System.out.println(voucherElement.getElementsByTagName("BILLFINAL").item(0).getTextContent());
System.out.println(voucherElement.getElementsByTagName("BILLDUE").item(0).getTextContent());
System.out.println(voucherElement.getElementsByTagName("BILLOVERDUE").item(0).getTextContent());
}
} catch (Exception e) {
e.printStackTrace();
}
如果有任何问题,请随时问我。
英文:
I am able to read many type of xml file in java. but today i got a xml file and not able to read its details.
<ENVELOPE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>1</BILLREF>
<BILLPARTY>Party1</BILLPARTY>
</BILLFIXED>
<BILLCL>-10800.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-10800.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>2</BILLREF>
<BILLPARTY>Party2</BILLPARTY>
</BILLFIXED>
<BILLCL>-2000.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-2000.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>3</BILLREF>
<BILLPARTY>Party3</BILLPARTY>
</BILLFIXED>
<BILLCL>-1416.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-1416.00</BILLFINAL>
<BILLDUE>31-Jul-2017</BILLDUE>
<BILLOVERDUE>0</BILLOVERDUE>
</ENVELOPE>
I am using this code for read xml file. I am able to read data inside <BILLFIXED>
tag but not able to read data outside of this like <BILLFINAL>
and <BILLDUE>
etc.
try {
File fXmlFile = new File("filepath");
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(fXmlFile);
doc.getDocumentElement().normalize();
NodeList billNodeList = doc.getElementsByTagName("ENVELOPE");
for(int i=0;i<billNodeList.getLength();i++){
Node voucherNode = billNodeList.item(i);
Element voucherElement = (Element) voucherNode;
NodeList nList = voucherElement.getElementsByTagName("BILLFIXED");
for (int temp = 0; temp < nList.getLength(); temp++) {
Node insideNode = nList.item(temp);
Element voucherElements = (Element) insideNode;
System.out.println(voucherElements.getElementsByTagName("BILLDATE").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLREF").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLPARTY").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLFINAL").item(0).getTextContent());
System.out.println(voucherElements.getElementsByTagName("BILLOVERDUE").item(0).getTextContent());
}
}
} catch (Exception e) {
e.printStackTrace();
}
I am try all possible way which i know that but currently i am not able to find any solution.
If anyone have any solution please share with me.
答案1
得分: 1
一种方法是将 XML 进行“修复”,使其更具有良好的结构,例如以下方式:
// 修复 XML
Element envelopeElem = doc.getDocumentElement();
List<Node> children = new ArrayList<>();
for (Node child = envelopeElem.getFirstChild(); child != null; child = child.getNextSibling())
children.add(child);
Element billElem = null;
for (Node child : children) {
if (child.getNodeType() == Node.ELEMENT_NODE && "BILLFIXED".equals(child.getNodeName()))
envelopeElem.insertBefore(billElem = doc.createElement("BILL"), child);
if (billElem != null)
billElem.appendChild(child);
}
该代码基本上在遇到“BILLFIXED”元素时,创建一个新的<BILL>
元素作为<ENVELOPE>
的子元素,然后将所有后续节点移动到<BILL>
元素中。
结果是,DOM 树中的 XML 如下所示1,这应该更容易供您处理:
<ENVELOPE>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>1</BILLREF>
<BILLPARTY>Party1</BILLPARTY>
</BILLFIXED>
<BILLCL>-10800.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-10800.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>2</BILLREF>
<BILLPARTY>Party2</BILLPARTY>
</BILLFIXED>
<BILLCL>-2000.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-2000.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>3</BILLREF>
<BILLPARTY>Party3</BILLPARTY>
</BILLFIXED>
<BILLCL>-1416.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-1416.00</BILLFINAL>
<BILLDUE>31-Jul-2017</BILLDUE>
<BILLOVERDUE>0</BILLOVERDUE>
</BILL>
</ENVELOPE>
1) 为了方便人类阅读,已对 XML 进行了重新格式化,即重新缩进。
英文:
One way to do it, is to "fix" the XML to be more well-structured, e.g. like this:
// Fix the XML
Element envelopeElem = doc.getDocumentElement();
List<Node> children = new ArrayList<>();
for (Node child = envelopeElem.getFirstChild(); child != null; child = child.getNextSibling())
children.add(child);
Element billElem = null;
for (Node child : children) {
if (child.getNodeType() == Node.ELEMENT_NODE && "BILLFIXED".equals(child.getNodeName()))
envelopeElem.insertBefore(billElem = doc.createElement("BILL"), child);
if (billElem != null)
billElem.appendChild(child);
}
The code basically creates a new <BILL>
element as a child of <ENVELOPE>
whenever it encounters a <BILLFIXED>
element, then moves all subsequent nodes into the <BILL>
element.
The result is that the XML in the DOM tree looks like this<sup>1</sup>, which should be easier for you to process:
<ENVELOPE>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>1</BILLREF>
<BILLPARTY>Party1</BILLPARTY>
</BILLFIXED>
<BILLCL>-10800.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-10800.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>2</BILLREF>
<BILLPARTY>Party2</BILLPARTY>
</BILLFIXED>
<BILLCL>-2000.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-2000.00</BILLFINAL>
<BILLDUE>1-Jul-2017</BILLDUE>
<BILLOVERDUE>30</BILLOVERDUE>
</BILL>
<BILL>
<BILLFIXED>
<BILLDATE>1-Jul-2017</BILLDATE>
<BILLREF>3</BILLREF>
<BILLPARTY>Party3</BILLPARTY>
</BILLFIXED>
<BILLCL>-1416.00</BILLCL>
<BILLPDC/>
<BILLFINAL>-1416.00</BILLFINAL>
<BILLDUE>31-Jul-2017</BILLDUE>
<BILLOVERDUE>0</BILLOVERDUE>
</BILL>
</ENVELOPE>
<sup>1) The XML has been reformatted for human readability, i.e. it has been re-indented.</sup>
答案2
得分: 0
这不是结构良好的XML。在您的<envelope>
标签内部,没有任何内容来表示构成一个'bill'的六个属性集合的开始。通常每个属性集合都应该有一个包含它们的<bill>
和</bill>
标签。这会让解析器混淆...
英文:
It isn't well-structured XML. Inside your <envelope>
tags there is nothing to indicate the start of each set of six attributes that constitute a 'bill'. You'd normally expect that each one would have a <bill>
and </bill>
tag to contain them. And this is going to confuse the parser...
答案3
得分: 0
根据示例 XML,它包含了3条记录的数据。但是每条记录之间没有任何分隔。看起来每个字段的数据都填充到了 XML 标签中并写入文件。
我建议有两个可能的选项:
- 基于 JAVA:正如 Andreas 所建议的,读取文件内容并为每条记录添加一个根标签,这将产生有限的 XML 结构,从而更容易处理。当输入文件很大时可能会对性能产生影响。
- 基于转换:尝试使用 STX 转换,它可以将结构转换为所需的格式,无论是 XML 还是平面文件。然后处理将会更简单。
英文:
As per sample XML, it has data for 3 records. But each record does not have any separation. Looks like each field data populated into XML tag and written into file.
There 2 possible option I would suggest
- JAVA based : As Andreas suggested, Read the file content and add a root tag for each record which would give finite XML structure then would be easier to handle. Performance impact may raise when the input file is in large size.
- Transformation based : Try STX transformation which would convert the structure to required format either XML or even flat file. Then processing would be simpler
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论