Posted 2023-07-13 19:20:38

How to get the XML tag name after parsing as it is specified in the XML, without namespace conversion

Question

I need to parse XML into another structure.
Example:

a = """
<actors xmlns:fictional="http://characters.example.com">
  <actor>
    <name>Eric Idle</name>
    <fictional:character>Sir Robin</fictional:character>
    <fictional:character>Gunther</fictional:character>
    <fictional:character>Commander Clement</fictional:character>
  </actor>
</actors>
"""

I am using ElementTree to parse the tree:

from xml.etree import ElementTree

root = ElementTree.fromstring(a)

When I evaluate:

root[0][1].tag

I get this result:

{http://characters.example.com}character

But I need the tag exactly as it appeared in the original file:

fictional:character

How can I get this result?

Answer 1

Score: 1

With XPath, name() returns an element's name together with its namespace prefix, and local-name() returns it without the prefix. lxml, a third-party Python package, supports XPath 1.0:

import lxml.etree as lx

a = """
<actors xmlns:fictional="http://characters.example.com">
  <actor>
    <name>Eric Idle</name>
    <fictional:character>Sir Robin</fictional:character>
    <fictional:character>Gunther</fictional:character>
    <fictional:character>Commander Clement</fictional:character>
  </actor>
</actors>
"""

root = lx.fromstring(a)

# Relative path: select every child of each <actor> under the root <actors>
for el in root.xpath("actor/*"):
    # name() yields the tag as written in the source, prefix included
    print(el.xpath("name()"))

# name
# fictional:character
# fictional:character
# fictional:character

Answer 2

Score: 0

With the ElementTree library, there is no simple way to do this.
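
That said, a workaround that stays within the standard library seems possible with some extra bookkeeping. The sketch below is not part of the original answer: it uses ElementTree's iterparse with the "start-ns" event to record each namespace declaration, then maps the URI in a tag back to its prefix. The helper names parse_with_prefixes and qualified_name are hypothetical, and the approach assumes each namespace URI is bound to a single prefix in the document.

```python
import io
import xml.etree.ElementTree as ET

XML = """
<actors xmlns:fictional="http://characters.example.com">
  <actor>
    <name>Eric Idle</name>
    <fictional:character>Sir Robin</fictional:character>
    <fictional:character>Gunther</fictional:character>
    <fictional:character>Commander Clement</fictional:character>
  </actor>
</actors>
"""


def parse_with_prefixes(xml_text):
    """Parse xml_text; return (root, uri_to_prefix). Hypothetical helper."""
    uri_to_prefix = {}
    root = None
    events = ("start-ns", "start")
    for event, payload in ET.iterparse(io.StringIO(xml_text), events=events):
        if event == "start-ns":
            # payload is a (prefix, uri) tuple for each xmlns declaration
            prefix, uri = payload
            uri_to_prefix[uri] = prefix
        elif root is None:
            # the first "start" event carries the root element
            root = payload
    return root, uri_to_prefix


def qualified_name(tag, uri_to_prefix):
    """Turn '{uri}local' back into 'prefix:local'. Hypothetical helper."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        prefix = uri_to_prefix.get(uri)
        # an empty prefix means the default namespace: return the bare name
        return f"{prefix}:{local}" if prefix else local
    return tag


root, uri_to_prefix = parse_with_prefixes(XML)
print(qualified_name(root[0][1].tag, uri_to_prefix))
```

If the same URI is declared under several prefixes, or a prefix is redeclared deeper in the tree, this flat mapping loses information and the reconstructed name may differ from the source text.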

Answer 3

Score: 0

You can use re.sub() to replace the namespace URI in the tag with its prefix:

import xml.etree.ElementTree as ET
import re
from io import StringIO

a = """
<actors xmlns:fictional="http://characters.example.com">
 <actor>    
    <name>Eric Idle</name>
     <fictional:character>Sir Robin</fictional:character>
     <fictional:character>Gunther</fictional:character>
     <fictional:character>Commander Clement</fictional:character>
   </actor>
</actors>
"""
f = StringIO(a)

tree = ET.parse(f)
root = tree.getroot()

ns={"fictional": "http://characters.example.com"}

for elem in root.findall(".//fictional:character", ns):
    # Escape the URI so "." and "{" are matched literally rather than as regex syntax
    print(re.sub(re.escape("{" + ns["fictional"] + "}"), "fictional:", elem.tag), elem.text)

Output:

fictional:character Sir Robin
fictional:character Gunther
fictional:character Commander Clement

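Answer 4

Score: 0

I found that the namespace conversion is done by the expat parser. It is created by the parser that ElementTree uses by default:

xml.etree.ElementTree.XMLParser

In its initialization method, the expat parser is created with this call:

parser = expat.ParserCreate(encoding, "}")

You can override the parser's standard behavior by changing that line to:

parser = expat.ParserCreate(encoding, None)

In this case, namespace processing is disabled, so tag names are passed through exactly as written in the document.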
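
The same effect may be achievable without editing the standard library source. The sketch below is an assumption on my part, not something from the original answer: it creates an expat parser with namespace_separator set to None and wires its handlers to ElementTree's TreeBuilder, so the resulting tree keeps tags as written. The function name parse_without_namespaces is hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

XML = """
<actors xmlns:fictional="http://characters.example.com">
  <actor>
    <name>Eric Idle</name>
    <fictional:character>Sir Robin</fictional:character>
    <fictional:character>Gunther</fictional:character>
    <fictional:character>Commander Clement</fictional:character>
  </actor>
</actors>
"""


def parse_without_namespaces(xml_text):
    """Build an ElementTree element with namespace processing disabled.

    Hypothetical helper: drives pyexpat directly instead of XMLParser.
    """
    builder = ET.TreeBuilder()
    # encoding=None, namespace_separator=None: expat leaves names untouched
    parser = expat.ParserCreate(None, None)
    parser.StartElementHandler = builder.start   # called as (name, attrs dict)
    parser.EndElementHandler = builder.end       # called as (name)
    parser.CharacterDataHandler = builder.data   # called as (text)
    parser.Parse(xml_text, True)                 # True marks the final chunk
    return builder.close()                       # returns the root element


root = parse_without_namespaces(XML)
print(root[0][1].tag)
```

The trade-off is that the tree is no longer namespace-aware: the xmlns:fictional declaration shows up as an ordinary attribute on the root, and searches must use the literal prefixed name rather than a namespace mapping.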
