April 20, 2023 05:37:42

Xpath using contains after an unclosed br tag

Question

I am testing the following markup using XPath and Selenium:

<table>
  <tbody>
    <tr>
      <td>
        123
        <br>
        456
      </td>
    </tr>
  </tbody>
</table>

This XPath matches:

//table/tbody/tr/td[contains(text(),'123')]

This one does not match:

//table/tbody/tr/td[contains(text(),'456')]

I'm only using "find" in Chrome developer tools to test my XPath expressions, so I suppose this is not really a Selenium question, but my eventual goal is a function that returns true if a value is found in the table and false if it is not.

My guess is that the unclosed <br> is messing things up, but I don't have the ability to change the markup, so I need to play the cards I was dealt. Does anyone have any ideas for finding out whether a value exists in a table with markup like this?

I expect that I might have to get all values in every cell (which hopefully returns the content before and after the <br>) and then loop through them in my code.

Answer 1

Score: 1

The problem is not that the br tag is "unclosed". In fact, the browser has parsed that empty <br> element, and your XPath is operating over a data model that is indistinguishable from the one an XML parser would produce from either <br/> or <br></br>.

The result is that you have a td element node which contains 3 child nodes:

a text node 123
an empty <br> element
another text node 456

The problem you're seeing is that the contains function converts its first argument, text(), into a string, as if by a call to the string function. In XPath 1.0, the string value of a node-set is defined to be the string value of the node in that set which is first in document order; in your example, that is the text node 123. So the second (and any subsequent) text node is ignored. Once again, that would be true whether the intervening element were a br, an img, a span, or anything else.
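The first-node rule described above can be imitated outside the browser. The following is only a rough sketch, not how any XPath engine is implemented: it assumes the cell is rewritten as well-formed XML (<br/>) so that Python's standard-library parser accepts it, and the helper names text_nodes and contains_first_text_node are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the cell is written as well-formed XML (<br/>), which per the
# answer yields the same node structure the browser builds from <br>.
CELL = "<td>123<br/>456</td>"


def text_nodes(element):
    """Return the text nodes that are direct children of element, in document order."""
    nodes = []
    if element.text:
        nodes.append(element.text)      # text before the first child element
    for child in element:
        if child.tail:
            nodes.append(child.tail)    # text that follows each child element
    return nodes


def contains_first_text_node(element, needle):
    """Imitate contains(text(), needle): only the first text node becomes the string."""
    nodes = text_nodes(element)
    first = nodes[0] if nodes else ""
    return needle in first


td = ET.fromstring(CELL)
print(text_nodes(td))                        # ['123', '456']
print(contains_first_text_node(td, "123"))   # True
print(contains_first_text_node(td, "456"))   # False
```

The td really does hold both text nodes; the second one is simply never looked at once the node-set has been reduced to a single string.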

One solution, if you want to check whether the td contains the string 456, and you don't care how that string is broken up by child elements or whether it is nested inside some other element (e.g., <span>456</span>), is to pass the entire string value of the td element as the first parameter to contains:

//table/tbody/tr/td[contains(.,'456')]

If you only wanted to know if the string 456 occurred in a text node which was a direct child of the td (and ignore the case where it was nested inside a span inside that td, for instance), you could use an XPath like this:

//table/tbody/tr/td[text()[contains(.,'456')]]

(Read this as "a td which has a text node child, which in turn contains the string 456")
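To make the difference between the two expressions concrete, here is a small sketch that imitates both predicates in Python. It is an approximation under stated assumptions, not an XPath engine: the markup is rewritten as well-formed XML, the nested <span> variant is an invented test case, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Concatenate all descendant text, like the XPath string value of an element."""
    return "".join(element.itertext())


def direct_text_nodes(element):
    """Text nodes that are direct children of element, in document order."""
    nodes = [element.text] if element.text else []
    nodes += [child.tail for child in element if child.tail]
    return nodes


def contains_anywhere(element, needle):
    # Imitates td[contains(., needle)]
    return needle in string_value(element)


def contains_in_child_text(element, needle):
    # Imitates td[text()[contains(., needle)]]
    return any(needle in node for node in direct_text_nodes(element))


# Assumption: well-formed XML stand-ins for the cell; `nested` is a made-up variant.
plain = ET.fromstring("<td>123<br/>456</td>")
nested = ET.fromstring("<td>123<br/><span>456</span></td>")

print(contains_anywhere(plain, "456"), contains_in_child_text(plain, "456"))
print(contains_anywhere(nested, "456"), contains_in_child_text(nested, "456"))
```

For the original cell both checks succeed; for the nested variant only the string-value check does, because 456 is no longer a direct text child of the td.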

Answer 2

Score: -1

Your assumption is correct. The <br> is causing the second locator to fail, but there is a way around it.

text() selects only the text nodes that are direct children of the element itself. You can use . instead, which basically "flattens" the text inside the element and all of its descendant elements into one string. So, you can shorten both of your locators to

//table/tbody/tr/td[contains(.,'123')]
//table/tbody/tr/td[contains(.,'456')]

and both will work.
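For the true/false function the question asks about, the same "flattened text" idea can be sketched without a browser. This is only an illustration using Python's standard-library HTML parser in place of Selenium; CellTextCollector and table_contains are hypothetical names, and the sketch assumes every <td> is closed explicitly, as in the question's markup.

```python
from html.parser import HTMLParser

HTML = """
<table>
  <tbody>
    <tr>
      <td>
        123
        <br>
        456
      </td>
    </tr>
  </tbody>
</table>
"""


class CellTextCollector(HTMLParser):
    """Collect the flattened text of every <td> cell (assumes explicit </td> tags)."""

    def __init__(self):
        super().__init__()
        self.cells = []        # one flattened string per cell
        self._in_cell = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_cell = True
            self._parts = []
        # Other tags, including the unclosed <br>, are simply skipped.

    def handle_endtag(self, tag):
        if tag == "td" and self._in_cell:
            self.cells.append("".join(self._parts))
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._parts.append(data)


def table_contains(html, value):
    """Return True if any cell's flattened text contains value, else False."""
    parser = CellTextCollector()
    parser.feed(html)
    parser.close()
    return any(value in cell for cell in parser.cells)


print(table_contains(HTML, "123"))
print(table_contains(HTML, "456"))
print(table_contains(HTML, "789"))
```

With Selenium itself, the equivalent check would presumably be whether the contains(., ...) locator returns any elements at all.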

NOTE: Your XPath might be overly specific. For example, by specifying the tbody layer you are excluding table headings. If you ONLY want to search the body of the table, then you're good.

