How does preceding-sibling work in XPath and Python? It seems to display the wrong output

Summary

A single XPath 1.0 expression returns a node-set, and a node-set never contains the same node twice. The expression in the question finds two matching <Z> elements, but the preceding siblings they share (the first <Y1> and <Y2>) are returned only once, so five elements are printed instead of seven.

To get the expected output, select each matching <Z> first and evaluate preceding-sibling::* separately for each one:

from lxml import etree

tree = etree.parse('test.xml')
# Evaluate preceding-sibling::* once per matching <Z>; a single combined
# expression would merge the results into one duplicate-free node-set.
for z in tree.xpath("//X/Z[T/R1='ABC3']"):
    for i in z.xpath("preceding-sibling::*"):
        print(i.tag, " - ", i.text)

With this change the siblings are listed once per match, seven elements in total:

Y1  -  ABC1
Y2  -  ABC2
Y1  -  ABC1
Y2  -  ABC2
Z  -  

Y1  -  ABC7
Y2  -  ABC8

Question

For the XML data

<X>
 <Y1>ABC1</Y1>
 <Y2>ABC2</Y2>
 <Z>
   <T>
     <R1>ABC3</R1>
     <R2>ABC4</R2>
   </T>
   <T>
     <R1>ABC5</R1>
     <R2>ABC6</R2>
   </T>
 </Z>
 <Y1>ABC7</Y1>
 <Y2>ABC8</Y2>
 <Z>
   <T>
     <R1>ABC3</R1>
     <R2>ABC9</R2>
   </T>
   <T>
     <R1>ABC5</R1>
     <R2>ABC9</R2>
   </T>
 </Z>
</X>

I wrote a sample Python script like the one below.

from lxml import etree
tree = etree.parse('test.xml')
for i in tree.xpath("//X/Z/T[R1='ABC3']/parent::*/preceding-sibling::*"):
    print(i.tag, " - ", i.text)

I expected output like

Y1  -  ABC1
Y2  -  ABC2
Y1  -  ABC1
Y2  -  ABC2
Z  -  

Y1  -  ABC7
Y2  -  ABC8

but received one like

Y1  -  ABC1
Y2  -  ABC2
Z  -  

Y1  -  ABC7
Y2  -  ABC8

It should print all preceding siblings of each match. For the first match of R1=ABC3, it should print Y1 and Y2. For the second match of R1=ABC3, it should print all five preceding siblings. In total, seven elements should be printed.
What is the error here?

Answer 1

Score: 1


XPath 1.0 works with node-sets, and each / step eliminates duplicates based on node identity. A single XPath expression like the one you used therefore cannot return the same node twice; any duplicates are eliminated.

XPath 2.0 keeps the same duplicate elimination semantics for the / step operator, but it adds the more general concept of sequences, which can include duplicates. You can build such a sequence with for .. return:

for $p in //X/Z/T[R1='ABC3']/parent::* return $p/preceding-sibling::*

or, in XPath 3.1, with the ! operator:

//X/Z/T[R1='ABC3']/parent::*!preceding-sibling::*

See https://xqueryfiddle.liberty-development.net/eiZQFoV.

In XPath 1.0 you need several XPath evaluations in a loop of the host language (e.g. Python). In Python you could use a list comprehension:

element_list = [el for parent in tree.xpath("//X/Z/T[R1='ABC3']/parent::*") for el in parent.xpath("preceding-sibling::*")]
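To make the difference concrete, here is a minimal sketch that runs both approaches side by side with lxml, which evaluates XPath 1.0. The sample document from the question is inlined in compact form so the snippet runs without test.xml; the variable names XML, single and per_match are illustrative only.

```python
from lxml import etree

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<X>
 <Y1>ABC1</Y1>
 <Y2>ABC2</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC4</R2></T>
   <T><R1>ABC5</R1><R2>ABC6</R2></T>
 </Z>
 <Y1>ABC7</Y1>
 <Y2>ABC8</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC9</R2></T>
   <T><R1>ABC5</R1><R2>ABC9</R2></T>
 </Z>
</X>"""

root = etree.fromstring(XML)

# One XPath 1.0 expression: the result is a node-set, so the siblings shared
# by both matching <Z> elements appear only once.
single = root.xpath("//X/Z/T[R1='ABC3']/parent::*/preceding-sibling::*")

# One evaluation per matching parent: the shared siblings are repeated.
per_match = [
    el
    for parent in root.xpath("//X/Z/T[R1='ABC3']/parent::*")
    for el in parent.xpath("preceding-sibling::*")
]

print([el.tag for el in single])     # five elements
print([el.tag for el in per_match])  # seven elements
```

The first list holds the five elements the question reported, and the second holds the seven elements it expected.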

Answer 2

Score: 1


The question is tagged xslt, but you're not using XSLT. The expected output can be achieved using the following stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8"/>

<xsl:template match="/X">
    <xsl:for-each select="Z[T/R1='ABC3']">
        <xsl:for-each select="preceding-sibling::*">
            <xsl:value-of select="name()" />
            <xsl:text> - </xsl:text>
            <xsl:value-of select="text()" />
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>

As noted in the answer by Martin Honnen, it is necessary to process the preceding siblings of each matched node separately, in order to get two separate lists.
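If you want to try the stylesheet without leaving Python, lxml can apply XSLT 1.0 through its etree.XSLT class. The following is a rough sketch under that assumption: the stylesheet and the sample document are inlined as strings (XSLT_SOURCE and XML are illustrative names), and the XML is written in compact form.

```python
from lxml import etree

# Sample document from the question, inlined in compact form.
XML = """<X>
 <Y1>ABC1</Y1>
 <Y2>ABC2</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC4</R2></T>
   <T><R1>ABC5</R1><R2>ABC6</R2></T>
 </Z>
 <Y1>ABC7</Y1>
 <Y2>ABC8</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC9</R2></T>
   <T><R1>ABC5</R1><R2>ABC9</R2></T>
 </Z>
</X>"""

# The stylesheet from this answer, unchanged apart from indentation.
XSLT_SOURCE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="utf-8"/>
<xsl:template match="/X">
  <xsl:for-each select="Z[T/R1='ABC3']">
    <xsl:for-each select="preceding-sibling::*">
      <xsl:value-of select="name()" />
      <xsl:text> - </xsl:text>
      <xsl:value-of select="text()" />
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>"""

# Compile the stylesheet, then apply it to the parsed document.
transform = etree.XSLT(etree.fromstring(XSLT_SOURCE))
result = transform(etree.fromstring(XML))

# With method="text", str() on the result yields the plain-text output.
output = str(result)
print(output)
```

Each outer xsl:for-each iteration handles one matching Z, which is why the shared Y1 and Y2 siblings are written once per match.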


Note also that your expression:

Z/T[R1='ABC3']/parent::*

is unnecessarily convoluted: clearly, the parent of the matched T must be Z, so you can simply write:

Z[T/R1='ABC3']
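As a quick check of that simplification, the sketch below compares the two expressions in Python with lxml on the sample document from the question. It assumes lxml is installed; the names via_parent and via_predicate are illustrative.

```python
from lxml import etree

# Sample document from the question, inlined in compact form.
XML = """<X>
 <Y1>ABC1</Y1>
 <Y2>ABC2</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC4</R2></T>
   <T><R1>ABC5</R1><R2>ABC6</R2></T>
 </Z>
 <Y1>ABC7</Y1>
 <Y2>ABC8</Y2>
 <Z>
   <T><R1>ABC3</R1><R2>ABC9</R2></T>
   <T><R1>ABC5</R1><R2>ABC9</R2></T>
 </Z>
</X>"""

root = etree.fromstring(XML)
tree = root.getroottree()

# Original form: step down to the matching T, then back up to its parent.
via_parent = root.xpath("//X/Z/T[R1='ABC3']/parent::*")

# Simplified form: filter Z directly with a predicate on its descendants.
via_predicate = root.xpath("//X/Z[T/R1='ABC3']")

# getpath() returns a positional path for each element, which makes it easy
# to confirm that both expressions select the same nodes.
paths_parent = [tree.getpath(el) for el in via_parent]
paths_predicate = [tree.getpath(el) for el in via_predicate]

print(paths_parent)
print(paths_predicate)
print(paths_parent == paths_predicate)
```

Both expressions select the same two Z elements here, so the shorter predicate form can replace the original in the loop over matches.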

huangapple
  • Published on January 3, 2020 at 19:42:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59578014.html