Removing the same element across all the nodes of an XML tree

Question

For example, this is the XML file that I'm working with:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
        <description>Liechtenstein has a lot of flowers.</description>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
        <description>Singapore has a lot of street markets.</description>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
        <description>Panama has a lot of great food.</description>
    </country>
</data>

How would I write the code so that I can delete one child element (e.g. year or description) from every country node? For example, with the following code:

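# Remove every country whose year is greater than 2010.
# Assumes tree = ET.parse('sample.xml') and root = tree.getroot().
for country in root.findall('country'):
    year = int(country.find('year').text)
    if year > 2010:
        root.remove(country)
tree.write('sample.xml')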

I can remove any country node whose year element has a value greater than 2010. But that removes the entire country node, not just the year element. I know that I can remove a single child element of a node with the following:

# Remove the description element of one specific country.
for country in root.findall('country'):
    description_node = country.find('description')
    if description_node is not None and description_node.text == "Singapore has a lot of street markets.":
        country.remove(description_node)
tree.write('sample.xml')

But now I want to delete the description element, the year element, or the neighbor element from all of the country nodes, not just from one that matches a specific value.

Answer 1

Score: 0

One option is the following, which uses .findall and .remove:

import xml.etree.ElementTree as ET

file = 'source.xml'
data = ET.parse(file)

for country in data.findall('country'):
    for neighbor in country.findall('neighbor'):
        country.remove(neighbor)
    for year in country.findall('year'):
        country.remove(year)
    for description in country.findall('description'):
        country.remove(description)

ET.dump(data)

Output:

python yourscript.py 
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <gdppc>141100</gdppc>
        </country>
    <country name="Singapore">
        <rank>4</rank>
        <gdppc>59900</gdppc>
        </country>
    <country name="Panama">
        <rank>68</rank>
        <gdppc>13600</gdppc>
        </country>
</data>
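
The three loops in the answer above differ only in the tag name, so they can be folded into one helper that takes the tags to delete as a parameter. The following is a minimal sketch, not part of the original answer: the function name remove_children and the inline SAMPLE string (a shortened copy of the question's XML) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's XML, inlined so the sketch runs on its own.
SAMPLE = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
        <description>Liechtenstein has a lot of flowers.</description>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
        <description>Singapore has a lot of street markets.</description>
    </country>
</data>"""


def remove_children(root, tags, parent_tag="country"):
    """Remove every direct child named in `tags` from each `parent_tag` node.

    Hypothetical helper; returns the number of elements removed.
    """
    removed = 0
    for parent in root.findall(parent_tag):
        for tag in tags:
            # findall returns a list, so removing while looping over it is safe.
            for child in parent.findall(tag):
                parent.remove(child)
                removed += 1
    return removed


root = ET.fromstring(SAMPLE)
removed_count = remove_children(root, ["year", "description", "neighbor"])

# Collect the tag names that are still present under the country nodes.
remaining_tags = sorted({child.tag for country in root for child in country})
print(removed_count, remaining_tags)
```

Passing a single tag, for example remove_children(root, ["year"]), deletes only that element from every country and leaves the rest of each node in place.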

Answer 2

Score: 0

In XSLT 3.0 you can do, for example:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   version="3.0">
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:template match="year[. > 2000]"/>
</xsl:transform>

The empty template rule causes elements that match the predicate to be removed; the xsl:mode instruction causes everything else to be retained.
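
For readers staying in Python, roughly the same conditional removal can be sketched with xml.etree.ElementTree, which the first answer already uses. This is a sketch under stated assumptions rather than a translation of the stylesheet: the function name remove_years_after, its threshold parameter, and the inline XML are hypothetical, the threshold 2010 is taken from the question instead of the 2000 used above, and unlike the XSLT rule it only looks at year elements that are direct children of country.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the question's XML, inlined for a self-contained run.
SAMPLE = """<data>
    <country name="Liechtenstein"><rank>1</rank><year>2008</year></country>
    <country name="Singapore"><rank>4</rank><year>2011</year></country>
    <country name="Panama"><rank>68</rank><year>2011</year></country>
</data>"""


def remove_years_after(root, threshold):
    """Drop <year> children whose value exceeds `threshold`; keep the country.

    Hypothetical helper for illustration only.
    """
    for country in root.findall("country"):
        for year in country.findall("year"):
            # Skip empty <year/> elements instead of failing on int(None).
            if year.text is not None and int(year.text) > threshold:
                country.remove(year)  # remove from the parent, not from root


root = ET.fromstring(SAMPLE)
remove_years_after(root, 2010)

# Map each country name to its remaining year text (None if it was removed).
kept = {c.get("name"): c.findtext("year") for c in root.findall("country")}
print(kept)
```

The key difference from the question's first snippet is that remove is called on the country (the parent of year) rather than on root, so the country node itself survives.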

huangapple
  • Posted on February 16, 2023, 03:18:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75464516.html