XPath using contains after an unclosed br tag

Question

I am testing the following markup using XPath and Selenium:

<table>
  <tbody>
    <tr>
      <td>
        123
        <br>
        456
      </td>
    </tr>
  </tbody>
</table>

This XPath matches:

//table/tbody/tr/td[contains(text(),'123')]

This one does not match:

//table/tbody/tr/td[contains(text(),'456')]

I'm just using "find" in Chrome developer tools to test my XPath expressions, so I suppose it's not really a Selenium question, but my eventual intent is to write a function that returns true if a value is found in the table and false if not.

My guess is that the unclosed <br> is messing things up, but I don't have the ability to change the markup, so I need to play the cards I was dealt. Does anyone have any ideas for how to find out whether a value exists in a table with markup like this?

I expect that I might have to get all values in every cell (which hopefully returns the stuff before and after the <br>) and then loop through them in my code.

Answer 1

Score: 1

The problem is not that the br tag is "unclosed". In fact, the browser has parsed that empty <br> element, and your XPath is operating over a data model that is indistinguishable from one that an XML parser would produce from either <br/> or <br></br>.

The result is that you have a td element node which contains 3 child nodes:

  • a text node 123
  • an empty <br> element
  • another text node 456
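
To make that three-node structure concrete, here is a minimal sketch using Python's standard xml.dom.minidom. It is only an illustration: the markup is written with <br/> so that an XML parser accepts it (an assumption, relying on the point above that the resulting tree is the same), and describe_children is a hypothetical helper name, not part of any library.

```python
from xml.dom import minidom

# Assumption: <br/> stands in for the HTML <br> so the fragment is well-formed XML.
MARKUP = "<td>123<br/>456</td>"


def describe_children(xml_fragment):
    """Return a (kind, value) pair for each child node of the root element."""
    root = minidom.parseString(xml_fragment).documentElement
    described = []
    for node in root.childNodes:
        if node.nodeType == node.TEXT_NODE:
            # Text node: report its content without surrounding whitespace.
            described.append(("text", node.data.strip()))
        elif node.nodeType == node.ELEMENT_NODE:
            # Element node: report its tag name.
            described.append(("element", node.tagName))
    return described


print(describe_children(MARKUP))
```

The td ends up with a text node, then the br element, then a second text node, which is the shape the XPath expressions are evaluated against.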

The problem you're seeing is that the contains function converts its first argument, text(), into a string, as if by a call to the string function. In XPath 1.0, the string value of a node-set is defined to be the string value of the node in that set which comes first in document order; in your example, that is the text node 123. So the second (and any subsequent) text node is ignored. Once again, that would be true whether the intervening element were a br, an img, a span, or anything else.
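
The first-node rule can be mimicked in plain Python to see why 456 is missed. This is a rough emulation with the standard xml.etree.ElementTree module, not a real XPath engine; direct_text_nodes and contains_first_text_node are hypothetical helper names, and <br/> is again assumed in place of <br>.

```python
import xml.etree.ElementTree as ET

# Assumption: <br/> stands in for the HTML <br> so the fragment is well-formed XML.
td = ET.fromstring("<td>123<br/>456</td>")


def direct_text_nodes(element):
    """Text nodes that are direct children of element, in document order."""
    nodes = []
    if element.text:
        nodes.append(element.text)
    for child in element:
        # ElementTree stores the text that follows a child element as its tail.
        if child.tail:
            nodes.append(child.tail)
    return nodes


def contains_first_text_node(element, needle):
    """Mimic contains(text(), needle): only the first text node is tested."""
    nodes = direct_text_nodes(element)
    first = nodes[0] if nodes else ""  # an empty node-set converts to ""
    return needle in first


print(contains_first_text_node(td, "123"))
print(contains_first_text_node(td, "456"))
```

Only the first entry of the list is ever looked at, so a value that lives in the second text node cannot match, whatever element sits in between.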

One solution, if you want to check whether the td contains the string 456, and you don't care how that string is broken up by child elements or whether it is nested inside some other element (e.g., <span>456</span>), is to pass the entire string value of the td element as the first parameter to contains:

//table/tbody/tr/td[contains(.,'456')]

If you only wanted to know if the string 456 occurred in a text node which was a direct child of the td (and ignore the case where it was nested inside a span inside that td, for instance), you could use an XPath like this:

//table/tbody/tr/td[text()[contains(.,'456')]]

(Read this as "a td which has a text node child, which in turn contains the string 456")
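
The difference between the two expressions shows up once the value is nested inside another element. The sketch below emulates both checks with the standard xml.etree.ElementTree module; it is not an XPath engine, the helper names are hypothetical, and the nested sample with a span is invented for illustration.

```python
import xml.etree.ElementTree as ET


def contains_string_value(element, needle):
    """Mimic td[contains(., needle)]: test all descendant text, concatenated."""
    return needle in "".join(element.itertext())


def any_direct_text_contains(element, needle):
    """Mimic td[text()[contains(., needle)]]: test each direct text-node child."""
    # Direct text nodes are the element's own text plus each child's tail.
    direct = [element.text] + [child.tail for child in element]
    return any(needle in text for text in direct if text)


# Assumption: <br/> stands in for the HTML <br>; the span variant is a made-up sample.
flat = ET.fromstring("<td>123<br/>456</td>")
nested = ET.fromstring("<td>123<br/><span>456</span></td>")

print(contains_string_value(flat, "456"), any_direct_text_contains(flat, "456"))
print(contains_string_value(nested, "456"), any_direct_text_contains(nested, "456"))
```

One thing to keep in mind with the string-value check: because all the text is concatenated, a needle can match across the br boundary when no whitespace separates the two text nodes.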

Answer 2

Score: -1

Your assumption is correct. The <br> is causing the second locator to fail but there is a way around it.

text() gives you specifically the text inside the element itself. You can alternatively use . which basically "flattens" text inside all descendant elements. So, you can shorten both your locators to

//table/tbody/tr/td[contains(.,'123')]
//table/tbody/tr/td[contains(.,'456')]

and both will work.


NOTE: Your XPath might be overly specific. For example, by specifying the tbody layer you are excluding table headings. If you ONLY want to search the body of the table, then you're good.
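
For the true/false function the question asks about, one possible shape is sketched below. It uses only Python's standard xml.etree.ElementTree rather than Selenium, so it assumes the table markup is available as a well-formed string (with <br/> instead of <br>); table_contains is a hypothetical name. It checks td and th cells wherever they appear, so it does not depend on the tbody layer.

```python
import xml.etree.ElementTree as ET

# Assumption: the table from the question, with <br/> so it parses as XML.
TABLE = """
<table>
  <tbody>
    <tr>
      <td>
        123
        <br/>
        456
      </td>
    </tr>
  </tbody>
</table>
"""


def table_contains(table_xml, value):
    """Return True if the full text of any td or th cell contains value."""
    table = ET.fromstring(table_xml.strip())
    for cell in table.iter():
        # Join every text fragment in the cell, like contains(., value) does.
        if cell.tag in ("td", "th") and value in "".join(cell.itertext()):
            return True
    return False


print(table_contains(TABLE, "456"))
print(table_contains(TABLE, "789"))
```

In a Selenium test, the same idea would more likely be expressed by passing the contains(., ...) XPath to a locator and checking whether any element comes back.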

huangapple
  • Posted on April 20, 2023 at 05:37:42
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76058989.html