Selenium: Get the text of a section only if it is available
Question
I need some help again; I'm stuck on some code I'm building.
The complication here is that not every HTML page I work with has the following section available:
<section class="pv-contact-info__contact-type ci-phone">
<li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
<svg
xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
<path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
</svg>
</li-icon>
<h3 class="pv-contact-info__header t-16 t-black t-bold">
Phone
</h3>
<ul class="list-style-none">
<li class="pv-contact-info__ci-container t-14">
<span class="t-14 t-black t-normal">
+391234567891
</span>
<span class="t-14 t-black--light t-normal">
(Mobile)
</span>
</li>
</ul>
</section>
So I'm trying to add a condition for it: if the section doesn't exist, write "Phone numb not available for this provider" instead.
This is the code I wrote for it:
try:
    phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
    phone = phone_element.get_attribute("innerHTML").strip()
except NoSuchElementException:
    phone = "Phone numb not available for this provider"
I am saving that variable into a CSV file.
But I'm getting the following exception when I run the code.
> phone_element = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']")))
> File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/selenium/webdriver/support/wait.py", line 95, in until
> raise TimeoutException(message, screen, stacktrace)
> selenium.common.exceptions.TimeoutException
Everything was working fine until I added that condition to the code.
Answer 1
Score: 1
Given the HTML:
<section class="pv-contact-info__contact-type ci-phone">
<li-icon aria-hidden="true" type="phone-handset" class="pv-contact-info__contact-icon">
<svg
xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" data-supported-dps="24x24" fill="currentColor" class="mercado-match" width="24" height="24" focusable="false">
<path d="M21.7 19.18l-1.92 1.92a3.07 3.07 0 01-3.33.67 25.52 25.52 0 01-8.59-5.63 25.52 25.52 0 01-5.63-8.59 3.07 3.07 0 01.67-3.33L4.82 2.3a1 1 0 011.41 0l3.15 3.11A1.1 1.1 0 019.41 7L7.59 8.73a20.51 20.51 0 007.68 7.68l1.78-1.79a1.1 1.1 0 011.54 0l3.11 3.11a1 1 0 010 1.41z"></path>
</svg>
</li-icon>
<h3 class="pv-contact-info__header t-16 t-black t-bold">
Phone
</h3>
<ul class="list-style-none">
<li class="pv-contact-info__ci-container t-14">
<span class="t-14 t-black t-normal">
+391234567891
</span>
<span class="t-14 t-black--light t-normal">
(Mobile)
</span>
</li>
</ul>
</section>
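To extract the text when the <section> is available, wait with WebDriverWait for visibility_of_element_located() instead of presence_of_element_located(). When the element never appears, WebDriverWait.until() raises TimeoutException rather than NoSuchElementException, so that is the exception to catch. You can use the following locator strategy: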
- Code block:

try:
    # Wait up to 10 seconds for the phone number <span> to become visible
    phone_element = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//h3[contains(., 'Phone')]//following-sibling::ul[1]//li/span")))
    phone = phone_element.get_attribute("innerHTML").strip()
    print(phone)
except TimeoutException:
    # until() raises TimeoutException when the section is missing
    phone = "Phone numb not available for this provider"
    print(phone)
- Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
> You can find a relevant discussion in "How to retrieve the text of a WebElement using Selenium - Python".
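
As an alternative to waiting and catching an exception, the lookup can be done with find_elements(), which returns an empty list when nothing matches. The sketch below is only an illustration of that different technique, not the method used above: the names phone_or_default and append_contact_row are hypothetical, the two-column CSV layout is an assumption, and because find_elements() does not wait, the page is assumed to be fully loaded before the call.

```python
import csv


def phone_or_default(driver, locator, default):
    """Return the stripped innerHTML of the first match, or default.

    driver is any Selenium WebDriver; locator is a (by, value) tuple such
    as (By.XPATH, "..."). find_elements() returns an empty list when
    nothing matches, so no exception handling is needed here. It does not
    wait, so the page is assumed to be loaded already.
    """
    elements = driver.find_elements(*locator)
    if not elements:
        return default
    return elements[0].get_attribute("innerHTML").strip()


def append_contact_row(csv_path, name, phone):
    """Append one (name, phone) row to the CSV file (hypothetical layout)."""
    with open(csv_path, "a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([name, phone])
```

With a real driver this might be called as phone = phone_or_default(driver, (By.XPATH, "//section[@class='pv-contact-info__contact-type ci-phone']//span[@class='t-14 t-black t-normal']"), "Phone numb not available for this provider"), and the result passed to append_contact_row() whether or not the section was found. If the contact section loads late, the WebDriverWait approach above is the safer choice.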
References
Links to useful documentation:
- get_attribute() method: gets the given attribute or property of the element.
- text attribute: returns the text of the element.
- Difference between text and innerHTML using Selenium