June 26, 2023, 12:40:21

How to parse XML that includes XML schema info using Python

Question

import xml.etree.ElementTree as ET
mytree=ET.parse("/Users/user/student.xml")
myroot=mytree.getroot()
tag=myroot.tag
print(tag)
#attr=myroot.attrib
#print(attr)

for p in myroot.findall('.//studentData'):
    acctDt=p.find('acctDt').text

<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
   <stuRec>
      <as>
         <sourceSys>BBC</sourceSys>
         <acctDt>2023-04-04</acctDt>
      </as>
      <stats>
         <ss>
            <prov>AB</prov>
            <cono>1</cono>
         </ss>
      </stats>
   </stuRec>
   <stuRec>
      <as>
         <sourceSys>RCD</sourceSys>
         <acctDt>2023-05-14</acctDt>
      </as>
      <stats>
         <ss>
            <prov>ON</prov>
            <cono>2</cono>
         </ss>
      </stats>
   </stuRec>
</studentData>

My XML file (student.xml) looks like the XML above. When I run the Python code, it prints the root tag and attributes, but the loop returns nothing. I want to get acctDt and prov:

user@star ~ % python -u "/Users/user/student.py"
{http://www.myschool.com/schmea/studentData}studentData
{'{http://www.w3.org/2000/10/XMLSchema-instance}schemaLocation': 'http://www.myschool.com/schmea/studentData Studentdata.xsd'}
user@star ~ %

Answer 1

Score: 2

You should adjust your loop, because your XML contains a namespace. Do something like:

ns = {'': 'http://www.myschool.com/schmea/studentData'}
for node in myroot.findall('.//acctDt', ns):
    print(node.text)

See also "Parsing XML with Namespaces" in the xml.etree.ElementTree documentation.
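
To make the failure mode concrete: the question's loop finds nothing for two reasons. The path './/studentData' carries no namespace, and it searches only the descendants of the root, while studentData is the root itself. The sketch below compares the un-namespaced lookup with two namespace-aware ones. It is an illustration under assumptions: the XML is a trimmed copy of the question's file inlined as a string so the code runs without student.xml, the prefix name "sd" is arbitrary, and, as far as I know, the {*} wildcard and the empty-prefix mapping used above both need Python 3.8 or newer.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document, inlined so the sketch is self-contained.
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
   <stuRec>
      <as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as>
      <stats><ss><prov>AB</prov><cono>1</cono></ss></stats>
   </stuRec>
   <stuRec>
      <as><sourceSys>RCD</sourceSys><acctDt>2023-05-14</acctDt></as>
      <stats><ss><prov>ON</prov><cono>2</cono></ss></stats>
   </stuRec>
</studentData>"""

root = ET.fromstring(XML)

# 1. No namespace: searches for "acctDt" in no namespace, so nothing matches.
plain = root.findall('.//acctDt')

# 2. Explicit prefix mapping ("sd" is an arbitrary, hypothetical prefix name).
ns = {'sd': 'http://www.myschool.com/schmea/studentData'}
prefixed = [e.text for e in root.findall('.//sd:acctDt', ns)]

# 3. Wildcard namespace (Python 3.8+): match "prov" in any namespace.
wildcard = [e.text for e in root.findall('.//{*}prov')]

print(plain)     # []
print(prefixed)  # ['2023-04-04', '2023-05-14']
print(wildcard)  # ['AB', 'ON']
```

An explicit prefix also works on Python versions older than 3.8, which is the main reason to prefer it when the interpreter version is not known.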

Answer 2

Score: 0

I hope this works for your case:

from lxml import etree

tree = etree.parse('./xml_schema_info.xml')
root = tree.getroot()

# Collect the distinct namespace-qualified tags found in the document
ele_sets = set()
for ele in root.xpath('.//*'):
    ele_sets.add(ele.tag)
print(f'elements: \n{ele_sets}\nTotal: {len(ele_sets)}')

# Tags are matched in their fully qualified {namespace}name form
acctDt = '{http://www.myschool.com/schmea/studentData}acctDt'
for ele in root.iter(acctDt):
    print(f'acctDt: {ele.text}')
prov = '{http://www.myschool.com/schmea/studentData}prov'
for ele in root.iter(prov):
    print(f'prov: {ele.text}')
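
If lxml is not installed, roughly the same approach should work with the standard library, since xml.etree.ElementTree also exposes tags in {namespace}name form and Element.iter() accepts such a tag. The sketch below is an illustration, not part of the answer: the helper qualified() and the inlined XML are hypothetical, and it reads the namespace URI off the root tag instead of hard-coding it.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document, inlined so the sketch is self-contained.
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
   <stuRec>
      <as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as>
      <stats><ss><prov>AB</prov><cono>1</cono></ss></stats>
   </stuRec>
   <stuRec>
      <as><sourceSys>RCD</sourceSys><acctDt>2023-05-14</acctDt></as>
      <stats><ss><prov>ON</prov><cono>2</cono></ss></stats>
   </stuRec>
</studentData>"""

root = ET.fromstring(XML)

# root.tag looks like "{uri}studentData"; take the URI from between the braces.
ns_uri = root.tag[1:root.tag.index('}')] if root.tag.startswith('{') else ''


def qualified(name):
    """Hypothetical helper: build the {namespace}name form ElementTree uses."""
    return f'{{{ns_uri}}}{name}' if ns_uri else name


acct_dates = [e.text for e in root.iter(qualified('acctDt'))]
provs = [e.text for e in root.iter(qualified('prov'))]

print(acct_dates)  # ['2023-04-04', '2023-05-14']
print(provs)       # ['AB', 'ON']
```

Deriving the URI from the root tag assumes every element of interest lives in the root's namespace, which holds for the question's file but not for documents that mix namespaces.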

Answer 3

Score: 0

For your extended question:

import xml.etree.ElementTree as ET
from io import StringIO

xml_str = """<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
   <stuRec>
      <as>
         <sourceSys>BBC</sourceSys>
         <acctDt>2023-04-04</acctDt>
      </as>
      <stats>
         <ss>
            <prov>AB</prov>
            <cono>1</cono>
         </ss>
      </stats>
   </stuRec>
   <stuRec>
      <as>
         <sourceSys>RCD</sourceSys>
         <acctDt>2023-05-14</acctDt>
      </as>
      <stats>
         <ss>
            <prov>ON</prov>
            <cono>2</cono>
         </ss>
      </stats>
   </stuRec>
</studentData>"""

f = StringIO(xml_str)

tree = ET.parse(f)
root = tree.getroot()

ns = {'': 'http://www.myschool.com/schmea/studentData'}

for strRec in root.findall('.//stuRec', ns):
    sourceSys = strRec.find('.//sourceSys', ns).text
    acctDt = strRec.find('.//acctDt', ns).text
    prov = strRec.find('.//prov', ns).text
    cono = strRec.find('.//cono', ns).text
    
    print(f"{sourceSys:<3},{acctDt:>15},{prov:>6},{cono:>5}")

Output:

BBC,     2023-04-04,    AB,    1
RCD,     2023-05-14,    ON,    2
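
One caveat with the find(...).text pattern: find() returns None when an element is missing, so a record without, say, prov would raise AttributeError. A minimal sketch of a more tolerant variant follows, assuming Python 3.8 or newer for the empty-prefix mapping. The function name read_records, the FIELDS tuple, and the shortened sample (whose second record deliberately lacks prov) are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS = {'': 'http://www.myschool.com/schmea/studentData'}
FIELDS = ('sourceSys', 'acctDt', 'prov', 'cono')


def read_records(root):
    """Return one dict per stuRec; fields that are absent map to None."""
    records = []
    for rec in root.findall('.//stuRec', NS):
        # findtext returns the default instead of raising when nothing matches
        records.append({
            name: rec.findtext(f'.//{name}', default=None, namespaces=NS)
            for name in FIELDS
        })
    return records


# Shortened sample: the second record has no <prov> element (hypothetical data).
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
   <stuRec>
      <as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as>
      <stats><ss><prov>AB</prov><cono>1</cono></ss></stats>
   </stuRec>
   <stuRec>
      <as><sourceSys>RCD</sourceSys><acctDt>2023-05-14</acctDt></as>
      <stats><ss><cono>2</cono></ss></stats>
   </stuRec>
</studentData>"""

records = read_records(ET.fromstring(XML))
for record in records:
    print(record)
```

Returning dicts rather than printing inside the loop keeps the parsing separate from the formatting, so the caller can decide how to render a missing value.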
