Selenium: Get the text from a section class only if it is available

Question


I need some help again; I'm stuck on a piece of code I'm building.

The complication is that not every HTML page I process contains the <section class="pv-contact-info__contact-type ci-phone"> element shown here:

    <section class="pv-contact-info__contact-type ci-phone">
      <li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
          <path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
        </svg>
      </li-icon>
      <h3 class="pv-contact-info__header t-16 t-black t-bold">
        Phone
      </h3>
      <ul class="list-style-none">
        <li class="pv-contact-info__ci-container t-14">
          <span class="t-14 t-black t-normal">
            +391234567891
          </span>
          <span class="t-14 t-black--light t-normal">
            (Mobile)
          </span>
        </li>
      </ul>
    </section>

So I'm trying to add a condition for it: if the section doesn't exist, store "Phone numb not available for this provider" instead.

This is the code I wrote for it:

    try:
        phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
        phone = phone_element.get_attribute("innerHTML").strip()
    except NoSuchElementException:
        phone = "Phone numb not available for this provider"

I am saving that variable into a CSV file.

But I'm getting the following exception when I run the code.

> phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
> File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 95, in until
> raise TimeoutException(message, screen, stacktrace)
> selenium.common.exceptions.TimeoutException

Everything was working fine until I added that condition to the code.

Answer 1

Score: 1


Given the HTML:

    <section class="pv-contact-info__contact-type ci-phone">
      <li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
          <path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
        </svg>
      </li-icon>
      <h3 class="pv-contact-info__header t-16 t-black t-bold">
        Phone
      </h3>
      <ul class="list-style-none">
        <li class="pv-contact-info__ci-container t-14">
          <span class="t-14 t-black t-normal">
            +391234567891
          </span>
          <span class="t-14 t-black--light t-normal">
            (Mobile)
          </span>
        </li>
      </ul>
    </section>

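To extract the text when the <section> is available, wait with WebDriverWait for visibility_of_element_located() instead of presence_of_element_located(), and catch TimeoutException instead of NoSuchElementException. WebDriverWait.until() raises TimeoutException when the condition is not met within the timeout, which is why the original except clause never ran. You can use the following XPath locator strategy: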

  • Code block:

    try:
        # until() raises TimeoutException if the span is not visible within 10 seconds
        phone_element = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//h3[contains(., 'Phone')]//following-sibling::ul[1]//li/span")))
        phone = phone_element.get_attribute("innerHTML").strip()
        print(phone)
    except TimeoutException:
        print("Phone numb not available for this provider")
  • Note: You have to add the following imports:

    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
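
Since the question stores the value in a CSV file rather than printing it, the same wait can be wrapped in a function that returns either the phone text or the fallback string. The following is only a sketch under assumptions: the names get_phone, append_row, PHONE_XPATH and FALLBACK, the file name contacts.csv, and the two-column row layout are all hypothetical and not from the original post. The Selenium imports sit inside get_phone so that the CSV helper can be tried without a browser session.

```python
import csv

# Hypothetical names for illustration; the fallback text is the one used in the question.
FALLBACK = "Phone numb not available for this provider"
PHONE_XPATH = (
    "//section[@class='pv-contact-info__contact-type ci-phone']"
    "//h3[contains(., 'Phone')]//following-sibling::ul[1]//li/span"
)


def get_phone(driver, timeout=10):
    """Return the phone text, or FALLBACK if the element never becomes visible."""
    # Imported here so append_row() below stays usable without Selenium installed.
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    try:
        element = WebDriverWait(driver, timeout).until(
            EC.visibility_of_element_located((By.XPATH, PHONE_XPATH))
        )
    except TimeoutException:
        # The section is missing (or hidden) on this page.
        return FALLBACK
    return element.get_attribute("innerHTML").strip()


def append_row(csv_path, profile_url, phone):
    """Append one (profile_url, phone) row to the CSV file at csv_path."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([profile_url, phone])


# Possible usage with an existing WebDriver instance named `driver`:
#     append_row("contacts.csv", driver.current_url, get_phone(driver))
```

Note that pages without the section will each wait for the full timeout before the fallback is returned, so a shorter timeout may be worth considering when many pages lack a phone number.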

You can find a relevant discussion in "How to retrieve the text of a WebElement using Selenium - Python".
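
The "value or fallback" logic can also be checked offline, without launching a browser. The sketch below does not use Selenium: it swaps in the standard library's xml.etree.ElementTree, which supports only a subset of XPath, and runs the question's simpler locator against a trimmed copy of the sample markup (the icon is omitted, and the wrapping <div> is an assumption added so the snippet has a single root). The names extract_phone, WITH_PHONE and WITHOUT_PHONE are hypothetical.

```python
import xml.etree.ElementTree as ET

FALLBACK = "Phone numb not available for this provider"

# Trimmed copy of the sample markup from the question (icon omitted).
WITH_PHONE = """
<div>
  <section class="pv-contact-info__contact-type ci-phone">
    <h3 class="pv-contact-info__header t-16 t-black t-bold">
      Phone
    </h3>
    <ul class="list-style-none">
      <li class="pv-contact-info__ci-container t-14">
        <span class="t-14 t-black t-normal">
          +391234567891
        </span>
        <span class="t-14 t-black--light t-normal">
          (Mobile)
        </span>
      </li>
    </ul>
  </section>
</div>
"""

# A page where the phone section is absent.
WITHOUT_PHONE = "<div></div>"


def extract_phone(markup):
    """Return the stripped phone text, or FALLBACK when the section is missing."""
    root = ET.fromstring(markup)
    span = root.find(
        ".//section[@class='pv-contact-info__contact-type ci-phone']"
        "//span[@class='t-14 t-black t-normal']"
    )
    if span is None or span.text is None:
        return FALLBACK
    return span.text.strip()


print(extract_phone(WITH_PHONE))
print(extract_phone(WITHOUT_PHONE))
```

This only works because the sample happens to be well-formed XML; real pages generally are not, so it is a way to sanity-check the locator and fallback branch rather than a replacement for the Selenium wait.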



huangapple
  • Published on June 26, 2023, 00:36:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76551443.html