How to close a clickable popup to continue scraping with Selenium in Python

Question

I'm trying to scrape information from clickable popups in a table on a website into a `pandas` DataFrame using `Selenium` in `python`. This works as long as the popup contains information.

```python
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.select import Select
import pandas as pd
import time

driver = webdriver.Chrome()
driver.get('https://mspotrace.org.my/Sccs_list')
time.sleep(20)

# Select maximum number of entries
elem = driver.find_element(By.CSS_SELECTOR, 'select[name=dTable_length]')
select = Select(elem)
select.select_by_value('500')
time.sleep(15)

# Get list of elements
elements = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//a[@title='View on Map']")))

# Loop through element popups and pull details of facilities into DF
pos = 0
df = pd.DataFrame(columns=['facility_name', 'other_details'])

try:
    for element in elements:
        data = []
        element.click()
        time.sleep(3)
        facility_name = driver.find_element(By.XPATH, '//h4[@class="modal-title"]').text
        other_details = driver.find_element(By.XPATH, '//div[@class="modal-body"]').text
        data.append(facility_name)
        data.append(other_details)
        df.loc[pos] = data
        WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[aria-label='Close'] > span"))).click()  # close popup window
        time.sleep(10)
        pos += 1
except:
    print("No geo location information")
    pass

print(df)
```

However, in some cases a window like the one below appears, and I need to click 'OK' on it to resume scraping the remaining rows on the page. I can't find the element to click to do this.

[![enter image description here][1]][1]

  [1]: https://i.stack.imgur.com/NdPqk.png
# Answer 1

**Score**: 0

The Selenium driver provides a way to switch to the alert context and interact with it. In the Python bindings, `switch_to` and `alert` are properties, not methods:

```python
alert = driver.switch_to.alert
```

After that, you can act on it depending on the alert type. To simulate clicking "OK":

```python
driver.switch_to.alert.accept()
```

More info [here](https://www.browserstack.com/guide/alerts-and-popups-in-selenium).
# Answer 2

**Score**: 0

In Python, you can try:

```python
driver.switch_to.alert.accept()
```

However, your test scenario should be explicit about where this popup appears. If you don't know, or it is "really" random, you could look into hooks that run after each test step.

huangapple
  • Posted on 2023-02-16 09:01:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75466856.html