Pick up 2 rows from CSV and convert to XML

Question


My text file has hundreds of entries like the ones below. I want my code to catch each event, which has 14 or 15 elements separated by a delimiter ( | ), and put it into XML. Each event should be captured in a new <name> tag.

6354|,EGZ|2023012711283700|900|DDIC|S000|R_JR_BTCJOBS_GENERATOR||1|25737,00088,B5|SAP_WORKFLOW_WIM_ACTION/11283700&JOB_CLOSE&&&&|43AE5E5C16990580E0063BBEAE21BEA8|42010A2A25FA1EDDA7CN
BDA81EE66224C|0000000000000000000000000000000000000\000000000000000000
6355|,EGZ|2023012711283700|900|DDIC|S000|R_JR_BTCJOBS_GENERATOR||1|25737,00088,B5|SAP_WORKFLOW_WIM_ACTION/11283700&JOB_CLOSE&&&&|43AE5E5C16990580E0063BBEAE21BEA8|42010A2A25FA1EDDA7CN
BDA81EE66224C|0000000000000000000000000000000000000\000000000000000000s

Expected output is this:
<?xml version='1.0' encoding='utf-8'?>
<Processes>
  <name>
    <Time>6354</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp></processidandwp>
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional1>text</additional1>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CN</additional3>
    <additional4>BDA81EE66224C</additional4>
    <additional5>000000000000000000/00000000000</additional5>
  </name>
  <name>
    <Time>6355</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp></processidandwp>
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CN</additional3>
    <additional4>BDA81EE66224C</additional4>
    <additional5>000000000000000000/00000000000</additional5>
  </name>
</Processes>

The current output that I get is this:
<?xml version='1.0' encoding='utf-8'?>
<Processes>
  <name>
    <Time>6354</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp></processidandwp>
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional1>SAP_WORKFLOW_WIM_ACTION/</additional1>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CN</additional3>
  </name>
  <name>
    <Time>BDA81EE66224C</Time>
    <Client>0000000000000000000000000000000000000\000000000000000000</Client>
  </name>
  <name>
    <Time>6355</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp></processidandwp>
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional1>SAP_WORKFLOW_WIM_ACTION/11</additional1>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CN</additional3>
  </name>
  <name>
    <Time>BDA81EE66224C</Time>
    <Client>0000000000000000000000000000000000000\000000000000000000s</Client>
  </name>
</Processes>

My code which I got is this:
import csv
import xml.etree.ElementTree as ET

row_names = [
    'Time',
    'Client',
    'User',
    'number',
    'processid',
    'program',
    'randomnumber',
    'processidandwp',
    'userclient',
    'transactionid',
    'additional1',
    'additional2',
    'additional3',
    'additional4'
]

root = ET.Element("Processes")
counter = 0

with open("data.csv", 'r') as file:
    csv_reader = csv.reader(file, delimiter="|")
    sub_root = ET.SubElement(root, 'name')
    for row in csv_reader:
        for name in row:
            if counter < len(row_names) and name:
                ele = ET.SubElement(sub_root, row_names[counter])
                ele.text = name
                counter += 1

ET.dump(root)

If you compare my current output with the expected output, the expected output is what I want. For now, when the code reads the rows from the file, as soon as it reaches the 2nd row (for the 1st event) or the 4th row (for the 2nd event), it creates a new <name> tag. Does that make sense?
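
The behaviour described in the question can be reproduced in isolation. The sketch below is an illustration, not part of the original post. It feeds a shortened, made-up two-line sample through csv.reader, with io.StringIO standing in for data.csv. The reader returns one row per physical line, which would explain why each event ends up split across two <name> elements.

```python
import csv
import io

# Hypothetical, shortened sample: one logical event broken across two physical lines.
sample = "6354|,EGZ|900|42010A2A25FA1EDDA7CN\nBDA81EE66224C|0000\n"

# io.StringIO stands in for the open data.csv file handle.
rows = list(csv.reader(io.StringIO(sample), delimiter="|"))

# csv.reader yields one list per physical line, so the single event arrives as two rows.
for row in rows:
    print(len(row), row)
```

Any fix therefore has to stitch the two physical lines back together before the fields are mapped to tag names.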

Answer 1

Score: 1

Assuming an even number of lines in data.csv, the following refactored Python code may work for you. It combines every 2 lines into a single pipe-delimited record, which is then split into an array. For each item in the array, it builds an XML node using the corresponding node name from the node_names array.

import xml.etree.ElementTree as ET
import itertools

# One tag name per pipe-delimited field, in order
node_names = ['Time', 'Client', 'User', 'number', 'processid',
              'program', 'randomnumber', 'processidandwp', 'userclient', 'transactionid',
              'additional1', 'additional2', 'additional3', 'additional4', 'additional5']

root = ET.Element('Processes')
with open('data.csv') as f:
    # [f]*2 repeats the same file iterator, so each step yields two consecutive lines
    for l1, l2 in itertools.zip_longest(*[f]*2):
        sub_root = ET.SubElement(root, 'name')
        # Join the two halves into one record, then split it into fields
        for idx, item in enumerate("".join([l1.strip(), l2.strip()]).split("|")):
            ele = ET.SubElement(sub_root, node_names[idx])
            ele.text = item

ET.indent(root, space="  ", level=0)  # requires Python 3.9+
ET.dump(root)

Output:

<Processes>
  <name>
    <Time>6354</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp />
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional1>SAP_WORKFLOW_WIM_ACTION/11283700&amp;JOB_CLOSE&amp;&amp;&amp;&amp;</additional1>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CNBDA81EE66224C</additional3>
    <additional4>0000000000000000000000000000000000000\000000000000000000</additional4>
  </name>
  <name>
    <Time>6355</Time>
    <Client>,EGZ</Client>
    <User>2023012711283700</User>
    <number>900</number>
    <processid>DDIC</processid>
    <program>S000</program>
    <randomnumber>R_JR_BTCJOBS_GENERATOR</randomnumber>
    <processidandwp />
    <userclient>1</userclient>
    <transactionid>25737,00088,B5</transactionid>
    <additional1>SAP_WORKFLOW_WIM_ACTION/11283700&amp;JOB_CLOSE&amp;&amp;&amp;&amp;</additional1>
    <additional2>43AE5E5C16990580E0063BBEAE21BEA8</additional2>
    <additional3>42010A2A25FA1EDDA7CNBDA81EE66224C</additional3>
    <additional4>0000000000000000000000000000000000000\000000000000000000s</additional4>
  </name>
</Processes>

Verification that source data.csv has 4 lines:

wc -l data.csv 
       4 data.csv
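
A possible variation on the approach above, offered as a sketch under stated assumptions rather than as the answer's own code. The helper name build_tree and its joiner parameter are made up for illustration. It pads an unpaired trailing line with an empty string instead of failing, ignores any fields beyond the known tag names, and serializes with ET.tostring(..., xml_declaration=True) (Python 3.8+), since ET.dump does not print the XML declaration shown in the question's expected output. The sample lines are the first event from the question.

```python
import itertools
import xml.etree.ElementTree as ET

NODE_NAMES = ['Time', 'Client', 'User', 'number', 'processid',
              'program', 'randomnumber', 'processidandwp', 'userclient',
              'transactionid', 'additional1', 'additional2', 'additional3',
              'additional4', 'additional5']


def build_tree(lines, joiner=''):
    """Hypothetical helper: pair up physical lines, one <name> element per pair."""
    root = ET.Element('Processes')
    it = iter(lines)
    # fillvalue='' keeps an unpaired trailing line from arriving as None
    for l1, l2 in itertools.zip_longest(it, it, fillvalue=''):
        parts = [p for p in (l1.strip(), l2.strip()) if p]
        if not parts:
            continue  # skip pairs made only of blank lines
        record = joiner.join(parts)
        sub_root = ET.SubElement(root, 'name')
        # zip() stops at the shorter input, so extra fields are silently dropped
        for tag, item in zip(NODE_NAMES, record.split('|')):
            ET.SubElement(sub_root, tag).text = item
    return root


# First event from the question; the raw string keeps the backslash literal.
sample = [
    "6354|,EGZ|2023012711283700|900|DDIC|S000|R_JR_BTCJOBS_GENERATOR||1|25737,00088,B5|"
    "SAP_WORKFLOW_WIM_ACTION/11283700&JOB_CLOSE&&&&|43AE5E5C16990580E0063BBEAE21BEA8|"
    "42010A2A25FA1EDDA7CN\n",
    r"BDA81EE66224C|0000000000000000000000000000000000000\000000000000000000",
]

root = build_tree(sample)
ET.indent(root, space="  ")  # Python 3.9+
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

With the default joiner='' the two lines are glued together exactly as in the answer, so additional3 holds 42010A2A25FA1EDDA7CNBDA81EE66224C. If the line break is really a field boundary, as the expected output in the question suggests, calling build_tree(sample, joiner='|') would instead yield 15 fields with BDA81EE66224C in additional4.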

huangapple
  • Posted on February 8, 2023, 22:21:45
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75387137.html