How to convert complex Xml to csv?
Question
I am writing a program in Java (I am at a pre-junior level), and I really need help with an XSLT transformation. I need to create a CSV file from XML.
I have this XSLT stylesheet:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
  <xsl:template match="node()" name="conv">
    <xsl:call-template name="loop"/>
  </xsl:template>
  <xsl:template name="loop">
    <xsl:for-each select="./*[count(*) = 0]">
      <xsl:value-of select="."/>
      <xsl:if test="position() != last()">
        <xsl:text>,</xsl:text>
      </xsl:if>
      <xsl:if test="position() = last()">
        <xsl:text>,</xsl:text>
      </xsl:if>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
    <xsl:for-each select="./*[(count(*) != 0) and (name()!='PARAMETRS')] ">
      <xsl:call-template name="loop"/>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
Source XML:
<Integration>
  <PARAMETRS>
    <ID>AZD</ID>
    <DATE>2020-01-01</DATE>
  </PARAMETRS>
  <ORG>
    <Thing>
      <object>10220</object>
      <type>U</type>
      <dyn>
        <items>
          <val>988009</val>
          <datebegin>2019-12-12</datebegin>
        </items>
      </dyn>
    </Thing>
    <Thing>
      <object>10221</object>
      <type>U</type>
      <dyn>
        <items>
          <val>988010</val>
          <datebegin>2019-12-13</datebegin>
        </items>
        <items>
          <val>988011</val>
          <datebegin>2019-12-14</datebegin>
        </items>
      </dyn>
    </Thing>
  </ORG>
</Integration>
In the output, I get comma-separated lines for each Thing, but the values of its items end up on separate lines below it, and I can't figure out how to join them onto the parent's line.
I would do it with value-of select="concat(...)", but a <dyn> may contain several <items> (1, 2, 3, ...), so that is not suitable.
The output needs to be a comma-separated CSV.
Please advise how to concatenate each item with its parent. Or is there a simpler way to parse XML with a varying number of child sections?
Expected output:
10220,U,988009,2019-12-12
10221,U,988010,2019-12-13,988011,2019-12-14
Answer 1
Score: 0
The output you show can be easily obtained using the following stylesheet:
XSLT 1.0
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/Integration">
    <xsl:for-each select="ORG/Thing">
      <xsl:value-of select="object"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="type"/>
      <xsl:text>,</xsl:text>
      <xsl:for-each select="dyn/items">
        <xsl:value-of select="val"/>
        <xsl:text>,</xsl:text>
        <xsl:value-of select="datebegin"/>
        <xsl:if test="position() != last()">
          <xsl:text>,</xsl:text>
        </xsl:if>
      </xsl:for-each>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Note that the output has a separate set of columns for each items element, so rows can have different numbers of columns (4 and 6 in the example above); this is not an ideal CSV structure.
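Since the question mentions Java, here is a minimal sketch of how a stylesheet like the one above could be applied with the JDK's built-in javax.xml.transform API, which handles XSLT 1.0. The class name XsltToCsv, the method transform, and the inlined string constants are assumptions made for illustration; a real program would more likely read the stylesheet and XML from files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; class and member names are illustrative, not from the post.
final class XsltToCsv {

    // The XSLT 1.0 stylesheet from this answer, inlined so the sketch is self-contained.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
        "  <xsl:output method=\"text\"/>",
        "  <xsl:template match=\"/Integration\">",
        "    <xsl:for-each select=\"ORG/Thing\">",
        "      <xsl:value-of select=\"object\"/>",
        "      <xsl:text>,</xsl:text>",
        "      <xsl:value-of select=\"type\"/>",
        "      <xsl:text>,</xsl:text>",
        "      <xsl:for-each select=\"dyn/items\">",
        "        <xsl:value-of select=\"val\"/>",
        "        <xsl:text>,</xsl:text>",
        "        <xsl:value-of select=\"datebegin\"/>",
        "        <xsl:if test=\"position() != last()\">",
        "          <xsl:text>,</xsl:text>",
        "        </xsl:if>",
        "      </xsl:for-each>",
        "      <xsl:text>&#xA;</xsl:text>",
        "    </xsl:for-each>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // The source XML from the question, written more compactly.
    static final String SAMPLE_XML = String.join("\n",
        "<Integration>",
        "  <PARAMETRS><ID>AZD</ID><DATE>2020-01-01</DATE></PARAMETRS>",
        "  <ORG>",
        "    <Thing>",
        "      <object>10220</object><type>U</type>",
        "      <dyn>",
        "        <items><val>988009</val><datebegin>2019-12-12</datebegin></items>",
        "      </dyn>",
        "    </Thing>",
        "    <Thing>",
        "      <object>10221</object><type>U</type>",
        "      <dyn>",
        "        <items><val>988010</val><datebegin>2019-12-13</datebegin></items>",
        "        <items><val>988011</val><datebegin>2019-12-14</datebegin></items>",
        "      </dyn>",
        "    </Thing>",
        "  </ORG>",
        "</Integration>");

    // Applies the given stylesheet to the given XML and returns the text output.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped as unchecked to keep the sketch short.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

Run against the question's XML, this should print the two rows listed under "Expected output". Depending on the processor, the line break written for the text output method may be the platform's line separator rather than a bare line feed.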
Answer 2
Score: 0
If you can use XSLT 2.0, it opens up new powerful functionality.
Oracle XML Developer Kit (XDK) supports XSLT 2.0.
Here is the link: Using the XSLT Processor for Java
The approach below does the following:
- Uses the string-join() function to concatenate the values of all descendant elements, whatever their hierarchy level, selected via the .//*/(text()[1] cast as xs:token?) expression.
- The cast to xs:token strips and collapses whitespace.
- The XPath predicate [. != ''] removes empty sequence members.
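To make the steps above concrete without an XSLT 2.0 processor, here is a rough plain-Java sketch of the same idea using the JDK's DOM API. It is an illustration, not the answer's method: the class name ThingRows and its methods are hypothetical, whitespace handling only approximates the xs:token cast, and it selects every Thing element in the document rather than only /Integration/ORG/Thing.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; class and member names are illustrative, not from the post.
final class ThingRows {

    // The source XML from the question, written more compactly.
    static final String SAMPLE_XML = String.join("\n",
        "<Integration>",
        "  <PARAMETRS><ID>AZD</ID><DATE>2020-01-01</DATE></PARAMETRS>",
        "  <ORG>",
        "    <Thing>",
        "      <object>10220</object><type>U</type>",
        "      <dyn>",
        "        <items><val>988009</val><datebegin>2019-12-12</datebegin></items>",
        "      </dyn>",
        "    </Thing>",
        "    <Thing>",
        "      <object>10221</object><type>U</type>",
        "      <dyn>",
        "        <items><val>988010</val><datebegin>2019-12-13</datebegin></items>",
        "        <items><val>988011</val><datebegin>2019-12-14</datebegin></items>",
        "      </dyn>",
        "    </Thing>",
        "  </ORG>",
        "</Integration>");

    // Strips and collapses whitespace, roughly what the xs:token cast does.
    static String collapse(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Counterpart of text()[1]: the first text-node child, or "" if there is none.
    static String firstText(Element element) {
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE) {
                return collapse(n.getNodeValue());
            }
        }
        return "";
    }

    // Builds one comma-joined row per Thing element.
    static List<String> toRows(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            doc.getDocumentElement().normalize(); // merge adjacent text nodes
            List<String> rows = new ArrayList<>();
            NodeList things = doc.getElementsByTagName("Thing");
            for (int i = 0; i < things.getLength(); i++) {
                Element thing = (Element) things.item(i);
                // Counterpart of .//* : all descendant elements in document order.
                NodeList descendants = thing.getElementsByTagName("*");
                List<String> values = new ArrayList<>();
                for (int j = 0; j < descendants.getLength(); j++) {
                    String value = firstText((Element) descendants.item(j));
                    if (!value.isEmpty()) { // counterpart of [. != '']
                        values.add(value);
                    }
                }
                rows.add(String.join(",", values)); // counterpart of string-join()
            }
            return rows;
        } catch (Exception e) {
            // Wrapped as unchecked to keep the sketch short.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        toRows(SAMPLE_XML).forEach(System.out::println);
    }
}
```

Container elements such as dyn and items have no text of their own beyond indentation, so they contribute empty strings that the filter drops, which is why only the leaf values reach each row.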
XSLT 2.0
<?xml version='1.0'?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>
  <xsl:template match="/Integration">
    <xsl:for-each select="ORG/Thing">
      <xsl:value-of select="string-join((.//*/(text()[1] cast as xs:token?))[. != ''],',')"/>
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Output
10220,U,988009,2019-12-12
10221,U,988010,2019-12-13,988011,2019-12-14
Based on a great tip from Martin Honnen, here is an even more concise XSLT 2.0 version without any loop.
XSLT 2.0
<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>
  <xsl:template match="/Integration">
    <xsl:value-of select="ORG/Thing/string-join((.//*/(text()[1] cast as xs:token?))[. != ''],',')" separator="&#xA;"/>
  </xsl:template>
</xsl:stylesheet>