Groovy XML to CSV with deep XML structure
Question
I need a Groovy script that transforms an XML structure into a CSV file without a header line.
The XML looks like this:
<stock>
    <field1>Header</field1>
    <field2>H1</field2>
    <positions>
        <data1>Hello</data1>
        <data2>P1</data2>
    </positions>
    <positions>
        <data1>World</data1>
        <data2>P2</data2>
    </positions>
</stock>
<stock>
    <field1>Header</field1>
    <field2>H2</field2>
    <positions>
        <data1>Hello</data1>
        <data2>P3</data2>
    </positions>
    <positions>
        <data1>World</data1>
        <data2>P4</data2>
    </positions>
</stock>
I tried this, but it does not do what I need:
// parse input
def parsedXml = new XmlParser().parseText(ins)

def content = new XmlSlurper().parseText(ins)
def csv = content.stock.positions.inject(header) { result, row ->
    [result, row.children().collect().join('|')].join("\n")
}
The CSV should look like this:
Header|H1
Hello|P1
World|P2
Header|H2
Hello|P3
World|P4
Any help is appreciated. Thanks.
Answer 1
Score: 0
This is not especially efficient because it reads the whole document into memory, but it works:
// The sample is wrapped in a single <stocks> root element so that it is well-formed XML.
String xmlString = """<stocks>
<stock><field1>Header</field1><field2>H1</field2>
<positions><data1>Hello</data1><data2>P1</data2></positions>
<positions><data1>World</data1><data2>P2</data2></positions></stock>
<stock><field1>Header</field1><field2>H2</field2>
<positions><data1>Hello</data1><data2>P3</data2></positions>
<positions><data1>World</data1><data2>P4</data2></positions></stock>
</stocks>"""
def xml = new XmlSlurper().parseText( xmlString )
// One row per <stock> header, followed by one row per <positions> entry.
List<List<String>> output = []
xml.stock.each { stock ->
   output << [ stock.field1.text(), stock.field2.text() ]
   stock.positions.each { position ->
      output << [ position.data1.text(), position.data2.text() ]
   }
}
// Join the columns with "|" and the rows with newlines.
println( output.collect { it.join("|") }.join("\n") )
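Since the question asks for a CSV file, the same loop can write each row to a Writer instead of collecting the rows in a list first. The following is only a sketch under a few assumptions: the helper name writeStockCsv is made up for illustration, the import assumes Groovy 3 or later (on Groovy 2.x, XmlSlurper lives in groovy.util and needs no import), and the parsed document is still held in memory by XmlSlurper.

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this line on Groovy 2.x

// Hypothetical helper: writes one pipe-separated line per <stock> header
// and one per <positions> entry to the given writer.
void writeStockCsv(String xmlText, Writer writer) {
    def xml = new XmlSlurper().parseText(xmlText)
    xml.stock.each { stock ->
        writer.write([stock.field1.text(), stock.field2.text()].join('|') + '\n')
        stock.positions.each { position ->
            writer.write([position.data1.text(), position.data2.text()].join('|') + '\n')
        }
    }
}

// Shortened sample input, wrapped in a single root element.
def sample = '<stocks><stock><field1>Header</field1><field2>H1</field2>' +
             '<positions><data1>Hello</data1><data2>P1</data2></positions>' +
             '</stock></stocks>'

// A StringWriter keeps the example self-contained.
def buffer = new StringWriter()
writeStockCsv(sample, buffer)
print buffer.toString()
```

To produce an actual file, the same helper could be called inside `new File('stock.csv').withWriter('UTF-8') { w -> writeStockCsv(sample, w) }`, where the file name is only a placeholder.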
Answer 2
Score: 0
You could traverse the XML tree and collect the nodes with matching names:
def ins = '''
<root>
	<stock>
		<field1>Header</field1>
		<field2>H1</field2>
		<positions>
			<data1>Hello</data1>
			<data2>P1</data2>
		</positions>
		<positions>
			<data1>World</data1>
			<data2>P2</data2>
		</positions>
	</stock>
	<stock>
		<field1>Header</field1>
		<field2>H2</field2>
		<positions>
			<data1>Hello</data1>
			<data2>P3</data2>
		</positions>
		<positions>
			<data1>World</data1>
			<data2>P4</data2>
		</positions>
	</stock>
</root>
'''
def content = new XmlSlurper().parseText(ins)
def csv = content.depthFirst()
        .findAll { it.name().matches(/(field|data).*/) }
        .collect{ it.text() }.collate(2)
        .collect { it.join("|") }.join("\n")
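The chain above flattens all matching text values into one list in document order and then relies on collate(2) to cut that list into rows. A small sketch of just that grouping step, using a hand-written list of values as a stand-in for the collected node texts, shows why it only works when every row has exactly two values:

```groovy
// Stand-in for the flat list of texts collected from the field*/data* nodes.
def values = ['Header', 'H1', 'Hello', 'P1', 'World', 'P2']

// collate(2) splits the flat list into consecutive groups of two.
def rows = values.collate(2)
assert rows == [['Header', 'H1'], ['Hello', 'P1'], ['World', 'P2']]

// With an odd number of values the last group is simply shorter,
// so a missing or extra tag would shift every following row.
assert ['a', 'b', 'c'].collate(2) == [['a', 'b'], ['c']]

// Each group becomes one pipe-separated line.
def csv = rows.collect { it.join('|') }.join('\n')
println csv
```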
Or, if you need a more generic approach that lets you extend the number of field and data tags:
def csv = content.depthFirst()
        .findAll { it.name() in ["stock", "positions"] }
        .collect { node -> node.children()
                    .findAll{ it.name() != "positions" }
                    .collect{ it.text() } }
        .collect { it.join("|") }.join("\n")
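To illustrate the extension point, here is a sketch that runs the generic variant against input with a third column. The field3 and data3 tags and their values are invented for this example, and the import assumes Groovy 3 or later (on Groovy 2.x, XmlSlurper is available without it).

```groovy
import groovy.xml.XmlSlurper  // assumption: Groovy 3+; drop this line on Groovy 2.x

// Invented sample: one stock with three header fields and three data fields.
def xmlText = '''
<root>
    <stock>
        <field1>Header</field1>
        <field2>H1</field2>
        <field3>X</field3>
        <positions>
            <data1>Hello</data1>
            <data2>P1</data2>
            <data3>Y</data3>
        </positions>
    </stock>
</root>
'''

def content = new XmlSlurper().parseText(xmlText)
def csv = content.depthFirst()
        // every <stock> and every <positions> element becomes one row
        .findAll { it.name() in ['stock', 'positions'] }
        // a row holds the text of all direct children except nested <positions>
        .collect { node -> node.children()
                    .findAll { it.name() != 'positions' }
                    .collect { it.text() } }
        .collect { it.join('|') }.join('\n')

println csv
```

Because the rows are built from whatever direct children each element has, no column count is hard-coded, unlike the collate(2) variant.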