How to parse XML that includes XML schema info using Python
Question
<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
<stuRec>
<as>
<sourceSys>BBC</sourceSys>
<acctDt>2023-04-04</acctDt>
</as>
<stats>
<ss>
<prov>AB</prov>
<cono>1</cono>
</ss>
</stats>
</stuRec>
<stuRec>
<as>
<sourceSys>RCD</sourceSys>
<acctDt>2023-05-14</acctDt>
</as>
<stats>
<ss>
<prov>ON</prov>
<cono>2</cono>
</ss>
</stats>
</stuRec>
</studentData>
import xml.etree.ElementTree as ET
mytree=ET.parse("/Users/user/student.xml")
myroot=mytree.getroot()
tag=myroot.tag
print(tag)
#attr=myroot.attrib
#print(attr)
for p in myroot.findall('.//studentData'):
    acctDt=p.find('acctDt').text
My XML file (student.xml) is the XML document shown above.
When I run the Python code, the root tag and its attributes print correctly, but the loop produces nothing. I want to get acctDt and prov:
user@star ~ % python -u "/Users/user/student.py"
{http://www.myschool.com/schmea/studentData}studentData
{'{http://www.w3.org/2000/10/XMLSchema-instance}schemaLocation': 'http://www.myschool.com/schmea/studentData Studentdata.xsd'}
user@star ~ %
Answer 1
Score: 2
Adjust your loop: your XML declares a default namespace, so every element name must be namespace-qualified in the search. Do something like:
ns = {'': 'http://www.myschool.com/schmea/studentData'}
for node in myroot.findall('.//acctDt', ns):
    print(node.text)
See "Parsing XML with Namespaces" in the ElementTree documentation.
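To see why the original loop finds nothing, here is a minimal sketch that runs an unqualified query and three namespace-aware alternatives against a trimmed copy of the question's XML. The inline `XML` string, the prefix `s`, and the variable names are assumptions made for illustration; the `{*}` wildcard (like the empty-prefix mapping used above) needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (illustrative, not the full file).
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
  <stuRec><as><acctDt>2023-04-04</acctDt></as><stats><ss><prov>AB</prov></ss></stats></stuRec>
  <stuRec><as><acctDt>2023-05-14</acctDt></as><stats><ss><prov>ON</prov></ss></stats></stuRec>
</studentData>"""

NS_URI = "http://www.myschool.com/schmea/studentData"
root = ET.fromstring(XML)

# 1. Unqualified name: matches nothing, because every element is in NS_URI.
unqualified = root.findall(".//acctDt")

# 2. Named prefix mapped to the namespace URI (the prefix "s" is arbitrary).
with_prefix = [e.text for e in root.findall(".//s:acctDt", {"s": NS_URI})]

# 3. Clark notation: the URI in braces, written directly in the path.
with_clark = [e.text for e in root.findall(f".//{{{NS_URI}}}acctDt")]

# 4. Wildcard namespace (Python 3.8+): match the local name in any namespace.
with_wildcard = [e.text for e in root.findall(".//{*}prov")]

print(unqualified)
print(with_prefix)
print(with_clark)
print(with_wildcard)
```

Two further details in the question's loop are worth noting: `.//studentData` only searches below the root, so it would not match the root element itself even with the namespace applied, and `p.find('acctDt')` looks at direct children only, while `acctDt` sits two levels below each `stuRec`.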
Answer 2
Score: 0
I hope this works for you:
from lxml import etree
tree = etree.parse('./xml_schema_info.xml')
root = tree.getroot()
# Collect the distinct (namespace-qualified) tags in the document
ele_sets = set()
for ele in root.xpath('.//*'):
    ele_sets.add(ele.tag)
print(f'elements: \n{ele_sets}\nTotal: {len(ele_sets)}')
# Iterate over elements by their fully qualified tag name
acctDt = '{http://www.myschool.com/schmea/studentData}acctDt'
for ele in root.iter(acctDt):
    print(f'acctDt: {ele.text}')
prov = '{http://www.myschool.com/schmea/studentData}prov'
for ele in root.iter(prov):
    print(f'prov: {ele.text}')
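The same idea, iterating by the fully qualified `{uri}name` tag, also works with the standard library, so lxml is optional here. The following is a rough sketch under that assumption; the inline `XML` string and the helper `qname` are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's document (illustrative, not the full file).
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
  <stuRec><as><acctDt>2023-04-04</acctDt></as><stats><ss><prov>AB</prov></ss></stats></stuRec>
  <stuRec><as><acctDt>2023-05-14</acctDt></as><stats><ss><prov>ON</prov></ss></stats></stuRec>
</studentData>"""

NS_URI = "http://www.myschool.com/schmea/studentData"


def qname(local_name: str) -> str:
    """Build the '{uri}local' form that ElementTree stores in Element.tag."""
    return f"{{{NS_URI}}}{local_name}"


root = ET.fromstring(XML)

# Distinct tag names in the document, with the '{uri}' part stripped for display.
local_names = sorted({el.tag.split("}", 1)[-1] for el in root.iter()})

# Element.iter() walks the whole tree and filters on the qualified tag.
acct_dates = [el.text for el in root.iter(qname("acctDt"))]
provs = [el.text for el in root.iter(qname("prov"))]

print(local_names)
print(acct_dates)
print(provs)
```

Building the qualified name in one helper keeps the namespace URI in a single place, which helps if the schema location changes.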
Answer 3
Score: 0
For your extended question:
import xml.etree.ElementTree as ET
from io import StringIO
xml_str="""<?xml version="1.0" encoding="UTF-8"?>
<studentData xmlns="http://www.myschool.com/schmea/studentData" xmlns:xsi="http://www.w3.org/2000/10/XMLSchema-instance" xsi:schemaLocation="http://www.myschool.com/schmea/studentData Studentdata.xsd">
<stuRec>
<as>
<sourceSys>BBC</sourceSys>
<acctDt>2023-04-04</acctDt>
</as>
<stats>
<ss>
<prov>AB</prov>
<cono>1</cono>
</ss>
</stats>
</stuRec>
<stuRec>
<as>
<sourceSys>RCD</sourceSys>
<acctDt>2023-05-14</acctDt>
</as>
<stats>
<ss>
<prov>ON</prov>
<cono>2</cono>
</ss>
</stats>
</stuRec>
</studentData>"""
f = StringIO(xml_str)
tree = ET.parse(f)
root = tree.getroot()
ns = {'': 'http://www.myschool.com/schmea/studentData'}
for strRec in root.findall('.//stuRec', ns):
    sourceSys = strRec.find('.//sourceSys', ns).text
    acctDt = strRec.find('.//acctDt', ns).text
    prov = strRec.find('.//prov', ns).text
    cono = strRec.find('.//cono', ns).text
    print(f"{sourceSys:<3},{acctDt:>15},{prov:>6},{cono:>5}")
Output:
BBC,     2023-04-04,    AB,    1
RCD,     2023-05-14,    ON,    2
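One caveat with the loop above: `find(...)` returns `None` when an element is absent, so `.text` would raise `AttributeError` on an incomplete record. A possible variant, sketched here under assumptions, uses `findtext` with a default and collects each record into a dict. The function name `read_records`, the `FIELDS` tuple, and the sample data (whose second record deliberately omits `acctDt`) are made up for illustration; the empty-prefix mapping needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Illustrative sample: the second record intentionally has no <acctDt>.
XML = """<studentData xmlns="http://www.myschool.com/schmea/studentData">
  <stuRec>
    <as><sourceSys>BBC</sourceSys><acctDt>2023-04-04</acctDt></as>
    <stats><ss><prov>AB</prov><cono>1</cono></ss></stats>
  </stuRec>
  <stuRec>
    <as><sourceSys>RCD</sourceSys></as>
    <stats><ss><prov>ON</prov><cono>2</cono></ss></stats>
  </stuRec>
</studentData>"""

# Empty prefix = default namespace for unprefixed names in the path.
NS = {"": "http://www.myschool.com/schmea/studentData"}
FIELDS = ("sourceSys", "acctDt", "prov", "cono")


def read_records(root):
    """Return one dict per <stuRec>, with '' for any missing field."""
    records = []
    for rec in root.findall("stuRec", NS):
        records.append(
            {name: rec.findtext(f".//{name}", default="", namespaces=NS)
             for name in FIELDS}
        )
    return records


records = read_records(ET.fromstring(XML))
for record in records:
    print(record)
```

Returning dicts rather than printing inside the loop makes it straightforward to pass the rows on to `csv.DictWriter` or similar, if that is the eventual goal.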