Selenium sometimes selects the incorrect page feature

Question

I am attempting to scrape the prices of various items of produce from UK supermarket websites. On the Asda website, I occasionally get the correct value, and occasionally the incorrect value.

I believe that there is more than one instance of the class I am searching for on the page (called co-product__price-per-uom). There is the main instance, the one I'm looking for, but there are also links to other items on the website with different values. It seems that these other items are sometimes loaded first, and Selenium picks them up rather than the main one I want.

Here is my code.

price_raw = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CLASS_NAME, str(class_list[url_list.index(url)]))))
price_raw = str(price_raw.text)

Can I ask Selenium to wait until all elements on the page are loaded, get all the ones that have the name value and then just take, say, the first in the list? Will the list order be the same every time or will they be added to the list as they are loaded?
Thanks

Answer 1

Score: 0

You were correct that there are multiple products on the page that all have the class you were looking for. The easiest way to narrow it down to the one you want is to use the dev console. In Chrome, press F12 to open the dev tools and then press ESC to open the console. In the console, use $$() to test CSS selectors and $x() to test XPaths. In your case, run $$(".co-product__price-per-uom") and see that there are 4 elements that match that locator. Expand the element list and hover over each one until you find the desired element. From there, look up the DOM until you find an ancestor element that can be uniquely identified and that differentiates the element you want from the others. There are many such elements but I used

<div class="pdp-main-details__price-and-uom">

Add that to your existing CSS selector and get

.pdp-main-details__price-and-uom .co-product__price-per-uom

which locates only the desired element.
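
As a rough sketch of how the scoped selector could be composed and sanity-checked from Python: the helper names `descendant_selector` and `count_matches` are hypothetical, and `browser` is assumed to be an already-open Selenium WebDriver on the product page. The class names are the ones quoted above; the site's markup may have changed since.

```python
def descendant_selector(*class_names):
    """Join class names into a CSS descendant selector, e.g. '.outer .inner'."""
    return " ".join("." + name for name in class_names)


# Container class from the answer, followed by the class from the question.
PRICE_SELECTOR = descendant_selector(
    "pdp-main-details__price-and-uom",
    "co-product__price-per-uom",
)


def count_matches(browser, selector=PRICE_SELECTOR):
    """Return how many elements currently match the selector.

    Assumption: `browser` is a live Selenium WebDriver. A result other
    than 1 suggests the selector is too broad or the page has not
    finished rendering yet.
    """
    # Imported lazily so descendant_selector works without Selenium installed.
    from selenium.webdriver.common.by import By

    return len(browser.find_elements(By.CSS_SELECTOR, selector))
```

Calling `count_matches(browser)` after the page has rendered is the Python counterpart of checking the length of the `$$()` result in the console.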

If you are interacting with an element on the page, wait for it to be clickable before clicking it, and wait for it to be visible before doing just about anything else, to prevent exceptions from being thrown. Presence only means the element is in the DOM; it is not necessarily visible or clickable. Updating your code:

price_raw = WebDriverWait(browser, 10).until(EC.visibility_of_element_located((By.CSS_SELECTOR, ".pdp-main-details__price-and-uom .co-product__price-per-uom"))).text
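
For completeness, here is a minimal sketch of the same wait wrapped in a function with the imports it needs. The function name `read_main_price` and the choice to return `None` on timeout are assumptions for illustration, not part of the original answer; `browser` is assumed to be a Selenium WebDriver that has already loaded the product page.

```python
PRICE_SELECTOR = ".pdp-main-details__price-and-uom .co-product__price-per-uom"


def read_main_price(browser, timeout=10, selector=PRICE_SELECTOR):
    """Return the visible text of the main price element, or None on timeout."""
    # Imported inside the function so this module loads without Selenium.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        # Wait for visibility, not mere presence in the DOM.
        element = WebDriverWait(browser, timeout).until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, selector))
        )
    except TimeoutException:
        # The element never became visible within `timeout` seconds.
        return None
    return element.text.strip()
```

Returning `None` instead of raising lets a loop over many product URLs skip pages where the price never appears; re-raising the exception would be equally reasonable if a missing price should stop the run.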

Posted by huangapple on 2023-02-26 20:26:52
Source: https://go.coder-hub.com/75571971.html