Xpath using contains after an unclosed br tag
Question
I am testing the following markup using XPath and Selenium:
<table>
<tbody>
<tr>
<td>
123
<br>
456
</td>
</tr>
</tbody>
</table>
This XPath matches:
//table/tbody/tr/td[contains(text(),'123')]
This one does not match:
//table/tbody/tr/td[contains(text(),'456')]
I'm only using "find" in Chrome developer tools to test my XPath expressions, so I suppose this isn't really a Selenium question, but my eventual goal is a function that returns true if a value is found in the table and false otherwise.
My guess is that the unclosed <br> is messing things up, but I don't have the ability to change the markup, so I need to play the cards I was dealt. Does anyone have any ideas for how to find out whether a value exists in a table with markup like this?
I expect that I might have to get all the values in every cell (which hopefully returns the text both before and after the <br>) and then loop through them in my code.
Answer 1
Score: 1
The problem is not that the br tag is "unclosed". In fact, the browser has parsed that empty <br> element, and your XPath is operating over a data model that is indistinguishable from one that an XML parser would produce from either <br/> or <br></br>.
The result is that you have a td element node which contains 3 child nodes:
- a text node 123
- an empty <br> element
- another text node 456
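One way to see this structure outside the browser is to feed the same markup to a tolerant HTML parser. The sketch below uses Python's standard-library html.parser purely for illustration (an assumption: it is not the browser's parser and it builds no DOM), and the class and function names are hypothetical. It records what appears inside the td, in document order.

```python
from html.parser import HTMLParser


class CellNodeLister(HTMLParser):
    """Record the text and elements seen inside a <td>, in document order."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
        elif self.in_cell:
            self.nodes.append(("element", tag))

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        # Skip whitespace-only runs; keep the trimmed text otherwise.
        if self.in_cell and data.strip():
            self.nodes.append(("text", data.strip()))


def list_cell_nodes(markup):
    """Hypothetical helper: return the recorded nodes for the given markup."""
    lister = CellNodeLister()
    lister.feed(markup)
    lister.close()
    return lister.nodes


MARKUP = "<table><tbody><tr><td>\n123\n<br>\n456\n</td></tr></tbody></table>"
print(list_cell_nodes(MARKUP))
# [('text', '123'), ('element', 'br'), ('text', '456')]
```

Replacing <br> with <br/> or <br></br> in MARKUP gives the same list, which is the point made above: whether the tag is closed makes no difference to the structure.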
The problem you're seeing is that the contains function converts its first argument, text(), into a string, as if by a call to the string function. The rule in XPath 1.0 (the version browsers implement) is that the string value of a set of nodes is the string value of the member of that set which is first in document order; in your example, that is the text node 123. So the second (and any subsequent) text node is ignored. Once again, that would be true whether the intervening element were a br, an img, a span, or anything else.
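The first-node rule can be imitated in a few lines of Python. This is only an emulation, not an XPath engine: it assumes the well-formed equivalent <td>123<br/>456</td>, uses the standard-library xml.etree.ElementTree (which stores text as .text and .tail rather than as separate text nodes), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def child_text_nodes(element):
    """Text that is a direct child of element, in document order."""
    nodes = []
    if element.text:
        nodes.append(element.text)
    for child in element:
        # ElementTree keeps the text that follows a child in child.tail.
        if child.tail:
            nodes.append(child.tail)
    return nodes


def contains_text_xpath1(element, needle):
    """Imitate contains(text(), needle): only the first text node is consulted."""
    nodes = child_text_nodes(element)
    return bool(nodes) and needle in nodes[0]


td = ET.fromstring("<td>123<br/>456</td>")
print(child_text_nodes(td))              # ['123', '456']
print(contains_text_xpath1(td, "123"))   # True
print(contains_text_xpath1(td, "456"))   # False
```

The second text node exists, but the conversion to a string never looks at it, which matches what Chrome's "find" reports for the two expressions in the question.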
One solution, if you want to check whether the td contains the string 456, and you don't care about how that string is broken up by child elements or whether it might be nested inside some other element (e.g., <span>456</span>), is to pass the entire string value of the td element as the first parameter to contains:
//table/tbody/tr/td[contains(.,'456')]
If you only wanted to know whether the string 456 occurred in a text node that was a direct child of the td (ignoring the case where it was nested inside a span inside that td, for instance), you could use an XPath like this:
//table/tbody/tr/td[text()[contains(.,'456')]]
(Read this as "a td which has a text node child, which in turn contains the string 456")
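The difference between the two expressions shows up once the value is nested or split across nodes. The following sketch imitates both predicates in plain Python. It assumes well-formed markup and the standard-library xml.etree.ElementTree, and the function names are hypothetical stand-ins rather than real XPath evaluation.

```python
import xml.etree.ElementTree as ET


def string_value_contains(element, needle):
    """Imitate contains(., needle): search the concatenated descendant text."""
    return needle in "".join(element.itertext())


def child_text_node_contains(element, needle):
    """Imitate text()[contains(., needle)]: test each direct text node alone."""
    texts = [element.text] + [child.tail for child in element]
    return any(needle in t for t in texts if t)


plain = ET.fromstring("<td>123<br/>456</td>")
nested = ET.fromstring("<td>123<br/><span>456</span></td>")

print(string_value_contains(plain, "456"))      # True
print(child_text_node_contains(plain, "456"))   # True
print(string_value_contains(nested, "456"))     # True
print(child_text_node_contains(nested, "456"))  # False: 456 is inside the span
print(string_value_contains(plain, "34"))       # True: matches across the br
print(child_text_node_contains(plain, "34"))    # False
```

The "34" case is the trade-off to keep in mind: the string value of the td is 123456, so contains(., ...) can match a value that straddles the br, while the per-text-node form cannot.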
Answer 2
Score: -1
Your assumption is partly right. The <br> is what makes the second locator fail, though not because it is unclosed: it splits the cell's text into two text nodes. There is a way around it.
text() selects only the text nodes that are direct children of the element. You can use . instead, which basically "flattens" the text inside the element and all its descendants into one string. So, you can shorten both your locators to
//table/tbody/tr/td[contains(.,'123')]
//table/tbody/tr/td[contains(.,'456')]
and both will work.
NOTE: Your XPath might be overly specific. For example, by specifying the tbody layer you are excluding any heading rows in a thead. If you ONLY want to search the body of the table, then you're good.
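For the true/false function mentioned in the question, a minimal sketch might look like the one below. It makes several assumptions: the driver is a Selenium 4 WebDriver whose find_elements(by, value) accepts the string "xpath" (the value of By.XPATH); the path is relaxed to any td or th under any table, per the note above; and both helper names are hypothetical. Because XPath 1.0 string literals have no escape character, the value is quoted with concat() when it contains both kinds of quote.

```python
def xpath_literal(value):
    """Quote value as an XPath 1.0 string literal."""
    if "'" not in value:
        return "'" + value + "'"
    if '"' not in value:
        return '"' + value + '"'
    # Both quote types present: join single-quoted pieces with a quoted apostrophe.
    pieces = value.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in pieces) + ")"


def table_contains(driver, value):
    """Return True if any td or th inside a table contains value."""
    xpath = (
        "//table//*[self::td or self::th][contains(., "
        + xpath_literal(value)
        + ")]"
    )
    # "xpath" is the string value of Selenium's By.XPATH locator strategy.
    return len(driver.find_elements("xpath", xpath)) > 0
```

With a live driver this would be called as table_contains(driver, "456"). Using find_elements rather than find_element avoids an exception when nothing matches, since an empty list simply yields False.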