XPath doesn't detect text with &nbsp;

Question

<tr>
 <td align="left" width="200">
  <p>Document Uploaded:&nbsp;Yes</p>
 </td>
</tr>

I'm unable to locate the element with &nbsp; in its text. The XPath expression below does not work, and I've tried many other suggestions online without success. FYI: I need to match the entire text, not just a substring of it.

//p[contains(text(), 'Document Uploaded: Yes')]

Answer 1

Score: 1

Try using

//p[contains(text(), 'Document Uploaded:&#160;Yes')]

&#160; is the numeric character reference for the named character reference &nbsp; (see Wikipedia). It can also be used in a DOCTYPE declaration at the beginning of an XML/XSLT document to make &nbsp; usable:

<!DOCTYPE stylesheet [
<!ENTITY nbsp  "&#160;" >
]>
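
One caveat worth checking: &#160; is an XML character reference, so it is only resolved when the expression sits inside an XML document such as an XSLT stylesheet. When the expression is passed as a plain string from Java, the string most likely has to contain the actual U+00A0 character instead. Below is a minimal sketch using the JDK's javax.xml.xpath API, assuming the markup from the question; the class name NbspContains and the helper count are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NbspContains {
    // The question's markup, with the non-breaking space written as the
    // numeric reference so the XML parser needs no DTD entity for nbsp.
    static final String XML =
        "<tr><td align=\"left\" width=\"200\">"
        + "<p>Document Uploaded:&#160;Yes</p>"
        + "</td></tr>";

    // Hypothetical helper: counts the nodes selected by an XPath expression.
    static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Ordinary space (U+0020) in the expression: no match expected.
        System.out.println(count("//p[contains(text(), 'Document Uploaded: Yes')]"));
        // The Java escape below places a real U+00A0 in the expression.
        System.out.println(count("//p[contains(text(), 'Document Uploaded:\u00A0Yes')]"));
    }
}
```

Run as written, this should print 0 for the first expression and 1 for the second.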

Answer 2

Score: 1

Use:

//p[. = 'Document Uploaded:&#xA0;Yes']

XSLT-based verification:

This XSLT transformation just evaluates the above XPath expression and outputs the result of the evaluation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>

  <xsl:template match="/">
    <xsl:copy-of select="//p[. = 'Document Uploaded:&#xA0;Yes']"/>
  </xsl:template>
</xsl:stylesheet>

When applied to the provided XML document:

<!DOCTYPE stylesheet [
<!ENTITY nbsp  "&#160;" >
]>
<tr>
 <td align="left" width="200">
  <p>Document Uploaded:&nbsp;Yes</p>
 </td>
</tr>

the wanted, correct result is produced:

<p>Document Uploaded:&#160;Yes</p>
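
If the test should not care which kind of space the page uses, one option is to map U+00A0 to an ordinary space with translate() before comparing. Note that normalize-space() does not help here, because it only treats space, tab, carriage return and line feed as whitespace. The following is a sketch with the JDK's javax.xml.xpath API, again assuming the question's markup; NbspNormalize and exists are illustrative names, not part of any library.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class NbspNormalize {
    // Same markup as in the question; the numeric reference stands in for nbsp.
    static final String XML =
        "<tr><td align=\"left\" width=\"200\">"
        + "<p>Document Uploaded:&#160;Yes</p>"
        + "</td></tr>";

    // Hypothetical helper: true if the expression selects at least one node.
    static boolean exists(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Exact match on the whole string value, with a real U+00A0.
        System.out.println(exists("//p[. = 'Document Uploaded:\u00A0Yes']"));
        // translate() swaps U+00A0 for a plain space, so the literal can use one.
        System.out.println(exists(
            "//p[translate(., '\u00A0', ' ') = 'Document Uploaded: Yes']"));
        // normalize-space() leaves U+00A0 alone, so this comparison fails.
        System.out.println(exists("//p[normalize-space(.) = 'Document Uploaded: Yes']"));
    }
}
```

The first two checks should report true and the third false, which is why translate() rather than normalize-space() is used in this sketch.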

Answer 3

Score: 0

Odd one, this! The other answers above should work.

An alternative solution is to work around the character:

//p[contains(text(), 'Document Uploaded:')][contains(text(), 'Yes')]

Use an XPath that looks for an element containing both strings you need. If you need it to be stricter, you can use starts-with and ends-with (note that ends-with is only available from XPath 2.0).

Another approach is not to find this item by its text, but rather to get the element and process it in Java.
I can't give a working example, as I would need to see more of the HTML, but the process would be:

  • Find a fixed anchor point, e.g. something else in your HTML table that you can identify easily.
  • Create an XPath from that anchor with following-sibling::, parent:: or following:: (you get the idea), whatever it takes to find this Document Uploaded: element.
  • Set that as your XPath in Java with findElement.
  • In Java: myElement.getText()

I think Selenium is smart enough to strip out the nbsp character, but even if it does not, you'll still have a text string confirming your document upload status.
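
The last step can be made independent of whatever the driver does with the non-breaking space by normalizing the string in Java before comparing it. This is a minimal sketch that works on a plain String, standing in for the value myElement.getText() would return; the class UploadStatus and both helper methods are hypothetical names chosen for the example.

```java
public class UploadStatus {
    // Replaces every non-breaking space (U+00A0) with an ordinary space
    // and trims leading and trailing whitespace.
    static String normalize(String raw) {
        return raw.replace('\u00A0', ' ').trim();
    }

    // True when the element text, after normalization, reports a successful upload.
    static boolean isUploaded(String elementText) {
        return normalize(elementText).equals("Document Uploaded: Yes");
    }

    public static void main(String[] args) {
        // Stand-in for the text read from the page, containing U+00A0.
        String fromPage = "Document Uploaded:\u00A0Yes";
        System.out.println(isUploaded(fromPage));
    }
}
```

Because the comparison happens after normalization, it gives the same answer whether the text arrives with U+00A0 or with an ordinary space.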

huangapple
  • Published on 2020-10-07 03:17:16
  • When reposting, please keep the link to this article: https://go.coder-hub.com/64232423.html