Selenium: Get the text of a section class only if it is available
Question

I need some help again; I'm stuck on some code I'm building.

The complication here is that not every HTML page I process contains this section:

<section class="pv-contact-info__contact-type ci-phone">
    <li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
        <svg
            xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
            <path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
        </svg>
    </li-icon>
    <h3 class="pv-contact-info__header t-16 t-black t-bold">
            Phone
          </h3>
    <ul class="list-style-none">
        <li class="pv-contact-info__ci-container t-14">
            <span class="t-14 t-black t-normal">
                  +391234567891
                </span>
            <span class="t-14 t-black--light t-normal">
                    (Mobile)
                  </span>
        </li>
    </ul>
</section>

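So I'm trying to add a condition: if the section doesn't exist, write "Phone number not available for this provider" instead.

This is the code I wrote for it: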

try:
    phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
    phone = phone_element.get_attribute("innerHTML").strip()
except NoSuchElementException:
    phone = "Phone number not available for this provider"

I am saving that variable to a CSV file.

However, when I run the code, I get the following exception:

phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 95, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException

Everything was working fine until I added that condition.


Answer 1

Score: 1

Given the HTML:

<section class="pv-contact-info__contact-type ci-phone">
    <li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
        <svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
            <path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
        </svg>
    </li-icon>
    <h3 class="pv-contact-info__header t-16 t-black t-bold">
        Phone
    </h3>
    <ul class="list-style-none">
        <li class="pv-contact-info__ci-container t-14">
            <span class="t-14 t-black t-normal">
                +391234567891
            </span>
            <span class="t-14 t-black--light t-normal">
                (Mobile)
            </span>
        </li>
    </ul>
</section>

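To extract the text only when the <section> is available, induce WebDriverWait for visibility_of_element_located() instead of presence_of_element_located(), and catch TimeoutException rather than NoSuchElementException. When the condition is not met within the timeout, WebDriverWait raises TimeoutException, which is why the original except clause never ran. You can use the following locator strategy: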

  • Code block:

    try:
        phone_element = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//h3[contains(., 'Phone')]//following-sibling::ul[1]//li/span")))
        phone = phone_element.get_attribute("innerHTML").strip()
        print(phone)
    except TimeoutException:
        # The wait timed out, so the phone section is not on this page.
        phone = "Phone number not available for this provider"
        print(phone)

  • Note: You have to add the following imports:

    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
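
As an alternative sketch (not part of the original answer): if the page has already finished loading, find_elements() returns an empty list when nothing matches, so the missing section can be handled without waiting for a timeout. The names get_phone, write_row, PHONE_XPATH and NO_PHONE are hypothetical, chosen here for illustration; the XPath is the one from the question, and the CSV layout (provider, phone) is an assumption.

```python
import csv

# XPath taken from the question. "xpath" is the string value of Selenium's By.XPATH.
PHONE_XPATH = (
    "//section[@class='pv-contact-info__contact-type ci-phone']"
    "//span[@class='t-14 t-black t-normal']"
)
NO_PHONE = "Phone number not available for this provider"


def get_phone(driver, default=NO_PHONE):
    """Return the phone text, or `default` when the section is missing.

    Hypothetical helper: `driver` is assumed to be a Selenium WebDriver
    whose page has already finished loading.
    """
    # find_elements() returns an empty list instead of raising when nothing matches.
    matches = driver.find_elements("xpath", PHONE_XPATH)
    if not matches:
        return default
    return matches[0].text.strip()


def write_row(csv_file, provider, phone):
    """Write one (provider, phone) row to an already open CSV file object."""
    csv.writer(csv_file).writerow([provider, phone])
```

A call such as write_row(f, name, get_phone(driver)) would then store either the number or the fallback text. If the contact panel loads asynchronously, the WebDriverWait version above is probably the safer choice, since this sketch does not wait at all.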

For a related discussion, see How to retrieve the text of a WebElement using Selenium - Python


  • Published on June 26, 2023, 00:36:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76551443.html