I can't parse the item. Selenium Python

Question

I first locate the main blocks and then try to read a child element from each one, but every iteration gives me the same element. Here is the code:

driver.get("site")
time.sleep(3)

diary_main = driver.find_elements(By.XPATH, "//div[@class='diary-day']")
for i in diary_main:
    diary_date = i.find_element(By.XPATH, "//div[@class='diary-day-date']").get_attribute("data-date")
    print(diary_date)
    print(i.text)
    time.sleep(2)

diary_date prints the same value on every iteration.
Sample HTML:

<div class="diary-day">
    <div class="diary-day-date" data-date="Sunday"></div>
</div>
<div class="diary-day">
    <div class="diary-day-date" data-date="Monday"></div>
</div>
<div class="diary-day">
    <div class="diary-day-date" data-date="Tuesday"></div>
</div>
<div class="diary-day">
    <div class="diary-day-date" data-date="Wednesday"></div>
</div>

Answer 1

Score: 1

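The problem is probably timing, or the XPath used inside the loop. An XPath that starts with // searches the whole document even when it is called on an element, so each iteration finds the first diary-day-date on the page. Starting it with .// limits the search to the current element. You can try this, which also waits explicitly for the diary_main elements:

import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# assumes `driver` is an existing WebDriver instance
driver.get("site")

# wait for the diary_main elements to be present
diary_main = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='diary-day']")))

for i in diary_main:
    # ".//" keeps the search inside the current diary-day block
    diary_date = i.find_element(By.XPATH, ".//div[@class='diary-day-date']").get_attribute("data-date")
    print(diary_date)
    print(i.text)
    time.sleep(2)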
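
To see why the leading dot matters without starting a browser, here is a minimal sketch that uses the standard library's xml.etree.ElementTree as a stand-in for the page. This is an assumption for illustration only: ElementTree is not Selenium, and its limited XPath does not model a document-root search from a child element, so that case is imitated by searching from the root. The sample markup is the one from the question, wrapped in a hypothetical body element.

```python
import xml.etree.ElementTree as ET

# Sample markup from the question, wrapped in a single root element
# (the <body> wrapper is an assumption so the snippet parses as XML).
SAMPLE_HTML = """
<body>
  <div class="diary-day"><div class="diary-day-date" data-date="Sunday"></div></div>
  <div class="diary-day"><div class="diary-day-date" data-date="Monday"></div></div>
  <div class="diary-day"><div class="diary-day-date" data-date="Tuesday"></div></div>
  <div class="diary-day"><div class="diary-day-date" data-date="Wednesday"></div></div>
</body>
""".strip()

root = ET.fromstring(SAMPLE_HTML)
days = root.findall(".//div[@class='diary-day']")

# Imitates "//div[...]" in the question's loop: the search starts at the
# document root every time, so the first match is returned on each pass.
from_root = [
    root.find(".//div[@class='diary-day-date']").get("data-date")
    for _ in days
]

# Imitates ".//div[...]": the search starts at the current diary-day block.
scoped = [
    day.find(".//div[@class='diary-day-date']").get("data-date")
    for day in days
]

print(from_root)
print(scoped)
```

The first list repeats the first date four times, which matches the symptom in the question. The second list contains one date per block.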

Answer 2

Score: 0

To extract the values of the data-date attribute, ideally wait with WebDriverWait for visibility_of_all_elements_located() and collect the values with a list comprehension. You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("data-date") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.diary-day > div.diary-day-date")))])

  • Using XPATH:

    print([my_elem.get_attribute("data-date") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='diary-day']/div[@class='diary-day-date']")))])

  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
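
Both answers rely on an explicit wait instead of a fixed time.sleep(). As a rough illustration of the idea only, and not Selenium's actual implementation, the sketch below polls a condition until it returns something truthy or a timeout expires. The names wait_until, WaitTimeout and fake_find_elements are hypothetical, and the "page" is simulated so the snippet runs without a browser.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} s")
        time.sleep(poll)


# Simulated page: the elements only "appear" on the third lookup.
attempts = {"count": 0}


def fake_find_elements():
    attempts["count"] += 1
    return ["Sunday", "Monday"] if attempts["count"] >= 3 else []


found = wait_until(fake_find_elements, timeout=2.0, poll=0.01)
print(found)
```

Compared with time.sleep(3), this returns as soon as the elements are available and fails loudly if they never show up.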
    

  • Posted by huangapple on February 24, 2023, 02:51:20
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75549124.html