How to get the XML tag name after parsing as it is specified in the XML, without namespace conversion
Question
I need to parse XML into another structure.
Example:
a = """
<actors xmlns:fictional="http://characters.example.com">
<actor>
<name>Eric Idle</name>
<fictional:character>Sir Robin</fictional:character>
<fictional:character>Gunther</fictional:character>
<fictional:character>Commander Clement</fictional:character>
</actor>
</actors>
"""
I am using ElementTree to parse the tree:
from xml.etree import ElementTree
root = ElementTree.fromstring(a)
When I evaluate
root[0][1].tag
I get
{http://characters.example.com}character
but I need the name exactly as it appears in the original file:
fictional:character
How can I get this result?
Answer 1
Score: 1
With XPath, name() returns an element's name together with its namespace prefix, as written in the source, while local-name() returns it without the prefix. Python's third-party package lxml can run XPath 1.0:
import lxml.etree as lx
a = """
<actors xmlns:fictional="http://characters.example.com">
<actor>
<name>Eric Idle</name>
<fictional:character>Sir Robin</fictional:character>
<fictional:character>Gunther</fictional:character>
<fictional:character>Commander Clement</fictional:character>
</actor>
</actors>
"""
root = lx.fromstring(a)
# The document element is <actors>, so the absolute path must start there
for el in root.xpath("/actors/actor/*"):
    print(el.xpath("name()"))
# name
# fictional:character
# fictional:character
# fictional:character
Answer 2
Score: 0
With the ElementTree library, there is no simple way to do this.
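That said, a workaround that stays within the standard library is possible. The following is only a minimal sketch, not a built-in ElementTree feature: it assumes each namespace URI is bound to a single prefix in the document, and the helper names parse_with_prefixes and prefixed_tag are made up for this illustration. It uses the "start-ns" events of ElementTree.iterparse to record the prefix declarations, then rewrites {uri}local back to prefix:local.

```python
import io
import xml.etree.ElementTree as ET

a = """
<actors xmlns:fictional="http://characters.example.com">
<actor>
<name>Eric Idle</name>
<fictional:character>Sir Robin</fictional:character>
<fictional:character>Gunther</fictional:character>
<fictional:character>Commander Clement</fictional:character>
</actor>
</actors>
"""


def parse_with_prefixes(xml_text):
    """Hypothetical helper: return (root, {namespace_uri: prefix})."""
    uri_to_prefix = {}
    root = None
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # payload is a (prefix, uri) tuple for each xmlns declaration
            prefix, uri = payload
            uri_to_prefix[uri] = prefix
        elif root is None:
            # the first "start" event carries the document element
            root = payload
    return root, uri_to_prefix


def prefixed_tag(elem, uri_to_prefix):
    """Hypothetical helper: turn '{uri}local' back into 'prefix:local'."""
    tag = elem.tag
    if not tag.startswith("{"):
        return tag  # element is not in any namespace
    uri, local = tag[1:].split("}", 1)
    prefix = uri_to_prefix.get(uri)
    # a default namespace has an empty prefix, so only the local name is kept
    return f"{prefix}:{local}" if prefix else local


root, prefixes = parse_with_prefixes(a)
print(prefixed_tag(root[0][1], prefixes))  # fictional:character
print(prefixed_tag(root[0][0], prefixes))  # name
```

If the same URI is declared under several prefixes, or a prefix is redefined deeper in the tree, this document-wide mapping is only an approximation of what the source file contained.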
Answer 3
Score: 0
You can use re.sub():
import xml.etree.ElementTree as ET
import re
from io import StringIO
a = """
<actors xmlns:fictional="http://characters.example.com">
<actor>
<name>Eric Idle</name>
<fictional:character>Sir Robin</fictional:character>
<fictional:character>Gunther</fictional:character>
<fictional:character>Commander Clement</fictional:character>
</actor>
</actors>
"""
f = StringIO(a)
tree = ET.parse(f)
root = tree.getroot()
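ns = {"fictional": "http://characters.example.com"}
for elem in root.findall(".//fictional:character", ns):
    # Swap the expanded "{uri}" part of the tag for the original prefix;
    # re.escape keeps the dots in the URI from acting as regex wildcards
    print(re.sub(re.escape("{http://characters.example.com}"), "fictional:", elem.tag), elem.text)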
Output:
fictional:character Sir Robin
fictional:character Gunther
fictional:character Commander Clement
Answer 4
Score: 0
I found out that the expat parser is what performs the namespace conversion. ElementTree uses it by default: xml.etree.ElementTree.XMLParser creates it in its __init__ method with
parser = expat.ParserCreate(encoding, "}")
You can override the standard behavior of the parser if you change this line to
parser = expat.ParserCreate(encoding, None)
In this case, namespace processing is disabled, so tag names keep their prefixes as written.
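Editing the standard library source is rarely practical, so here is a rough sketch of the same idea without touching it. It assumes that driving an expat parser directly and feeding its events into xml.etree.ElementTree.TreeBuilder is acceptable for your use case; parse_keep_prefixes is a hypothetical name for this illustration, and comments and processing instructions are simply ignored here.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

a = """
<actors xmlns:fictional="http://characters.example.com">
<actor>
<name>Eric Idle</name>
<fictional:character>Sir Robin</fictional:character>
<fictional:character>Gunther</fictional:character>
<fictional:character>Commander Clement</fictional:character>
</actor>
</actors>
"""


def parse_keep_prefixes(xml_text):
    """Hypothetical helper: build an Element tree with namespace processing off."""
    builder = ET.TreeBuilder()
    # No namespace_separator argument, so expat reports names as written
    parser = expat.ParserCreate()

    def start(tag, attrs):
        builder.start(tag, attrs)

    def end(tag):
        builder.end(tag)

    def data(text):
        builder.data(text)

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = data
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return builder.close()


root = parse_keep_prefixes(a)
print(root[0][1].tag)  # fictional:character
print(root.attrib)     # {'xmlns:fictional': 'http://characters.example.com'}
```

The trade-off is that the tree is no longer namespace-aware: the xmlns:fictional declaration shows up as an ordinary attribute, and searches must use the literal prefixed tag, for example root.iter("fictional:character"), instead of a namespace mapping.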