Unable to click on the Search button using Selenium (Python)

Question
I am trying to scrape the table on the following website (using Python + Selenium):
https://www.ser-ag.com/en/resources/notifications-market-participants/management-transactions.html#/
After I have successfully entered the date parameters in the form, I cannot click the "Search" button via .click(). Somehow Selenium does not find the XPath. Can you help me?
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

# Click the "Search" button
# Note: this first attempt used invalid XPath syntax -- contains() takes two
# arguments, e.g. contains(@id, 'value'):
# suchen = driver.find_element(By.XPATH, '//*[contains(@id="")]/div/div/div[1]/div/div[3]/div[1]/button[2]')
# suchen.click()
# time.sleep(2)
button_xpath = "//*[@id='vwHpLbo9TZFKRdgg']/div/div/div[1]/div/div[3]/div[1]/button[2]"
wait = WebDriverWait(driver, 10)  # wait up to 10 seconds
button = wait.until(EC.presence_of_element_located((By.XPATH, button_xpath)))
button.click()
Answer 1
Score: 0
Try this; the reason is probably that the button is outside the driver's viewport. This will "scroll" to the button and click it (the button locator is the same as in your code).
P.S.: please post your error message.
from selenium.webdriver.common.action_chains import ActionChains

button = wait.until(EC.presence_of_element_located((By.XPATH, button_xpath)))
# Pass the driver (the original snippet mistakenly passed self);
# move_to_element scrolls the button into view before clicking.
ActionChains(driver).move_to_element(button).click().perform()
Answer 2
Score: 0
You have to click the <kbd>Accept All cookies</kbd> button first, and then try to locate and click the <kbd>Search</kbd> button.
See the working code below:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

driver = webdriver.Chrome()
driver.get('https://www.ser-ag.com/en/resources/notifications-market-participants/management-transactions.html#/')
driver.maximize_window()

button_xpath = "//button[text()='Search']"
wait = WebDriverWait(driver, 10)  # wait up to 10 seconds

# Click the "Accept All cookies" button (a JavaScript click avoids
# interception by the cookie-banner overlay)
Accept_All_Btn = wait.until(EC.element_to_be_clickable((By.ID, "onetrust-accept-btn-handler")))
driver.execute_script("arguments[0].click();", Accept_All_Btn)

button = wait.until(EC.element_to_be_clickable((By.XPATH, button_xpath)))
button.click()
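A text-based locator like this is more robust than the auto-generated id in the question's XPath. The matching behaviour can be illustrated offline with Python's stdlib xml.etree on a hypothetical fragment (this is not the page's real markup, just an assumed stand-in):

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mimicking the page's filter bar -- the real page
# differs, but the locator principle is the same.
fragment = """
<div id='vwHpLbo9TZFKRdgg'>
  <div>
    <button class='Button'>Clear Filter</button>
    <button class='Button'>Search</button>
  </div>
</div>
"""
root = ET.fromstring(fragment)

# Matching on the visible label keeps working even when the generated
# id changes between page loads (ElementTree supports [.='text']).
matches = root.findall(".//button[.='Search']")
print(len(matches), matches[0].text)  # 1 Search
```

Note that ElementTree only supports a small XPath subset; Selenium's By.XPATH understands the full text()= predicate used above.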
Answer 3
Score: 0
Here is one way of getting that data (all of it):
import requests
import pandas as pd
headers = {
    'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36'
}
url = 'https://www.ser-ag.com/sheldon/management_transactions/v1/overview.json?pageSize=3000&pageNumber=0&sortAttribute=byDate&fromDate=20220710&toDate=20230710'
r = requests.get(url, headers=headers)
df = pd.json_normalize(r.json(), record_path=['itemList'])
print(df)
Result in terminal:
buySellIndicator correcteeId correctorId ISIN notificationId notificationSubmitter notificationSubmitterId obligorFunctionCode obligorRelatedPartyInd securityDescription securityTypeCode transactionAmountCHF transactionConditions transactionDate transactionSize transactionAmountPerSecurityCHF
0 2 CH0019199550 T1N7700039 Alpine Select AG ALPINE 1 L 7 981.0 20230707 90 10.9
1 1 CH0019199550 T1N7700021 Alpine Select AG ALPINE 1 L 7 2889.0 20230707 270 10.7
2 1 CH0473243506 T1N7700013 ONE swiss bank SA SBP 1 Acquisition a la valeur nominale dans le cadre... 7 50000.0 Acquisition en raison de l exercice de stock\n... 20230707 50000 1.0
3 1 CH0022427626 T1N7600015 LEM Holding SA LEM 2 L 7 108495.0 20230706 50 2169.9
4 1 CH0022427626 T1N7500017 LEM Holding SA LEM 2 L 7 108660.0 20230705 50 2173.2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2049 1 CH0019396990 T1M7C00010 Ypsomed Holding AG YPSOMED 2 7 20100.0 20220712 150 134.0
2050 1 T1M7B00046 CH0023405456 T1M7B00053 DUFRY AG DUFRY 1 L 7 1980000.0 20220711 60000 33.0
2051 1 T1M7B00012 CH0001341608 T1M7B00061 Hypothekarbank Lenzburg AG BKHYPOL 1 7 2639.0 20220711 1 2639.0
2052 1 CH0022427626 T1M7B00038 LEM Holding SA LEM 2 L 7 172570.0 20220711 100 1725.7
2053 2 CH0005059438 T1M7B00020 nebag ag NEBAG 1 7 32900.0 20220711 3500 9.4
[2054 rows x 16 columns]
The data is hydrated into the page from an API endpoint: you need to scrape that particular endpoint. You can find it by inspecting the network calls made by the scripts on that page.
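Since the endpoint's fromDate/toDate query parameters take YYYYMMDD dates, the window can be parameterised. A minimal sketch, where build_url is a made-up helper and the stub payload only mimics the itemList shape with invented values:

```python
from datetime import date

import pandas as pd

def build_url(from_date, to_date, page_size=3000):
    """Assemble the overview.json URL for an arbitrary date window (hypothetical helper)."""
    base = ('https://www.ser-ag.com/sheldon/management_transactions/v1/overview.json'
            '?pageSize={}&pageNumber=0&sortAttribute=byDate&fromDate={}&toDate={}')
    return base.format(page_size,
                       from_date.strftime('%Y%m%d'),
                       to_date.strftime('%Y%m%d'))

url = build_url(date(2022, 7, 10), date(2023, 7, 10))
print(url.endswith('fromDate=20220710&toDate=20230710'))  # True

# json_normalize flattens the itemList records the same way on a stub
# payload shaped like the API response (field names taken from the
# output above; values are invented).
stub = {'itemList': [
    {'buySellIndicator': '2', 'ISIN': 'CH0019199550', 'transactionAmountCHF': 981.0},
    {'buySellIndicator': '1', 'ISIN': 'CH0022427626', 'transactionAmountCHF': 2889.0},
]}
df = pd.json_normalize(stub, record_path=['itemList'])
print(df.shape)  # (2, 3)
```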
Answer 4
Score: 0
Here's an approach you can try:
import time
from selenium import webdriver
from selenium.webdriver import ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait
options = ChromeOptions()
options.add_argument("--start-maximized")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
driver = webdriver.Chrome(options=options)
wait = WebDriverWait(driver, 10)
url = 'https://www.ser-ag.com/en/resources/notifications-market-participants/management-transactions.html#/'
driver.get(url)
# wait and click on the "Accept all Cookies" button
wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, 'button#onetrust-accept-btn-handler'))).click()
# wait and click on the "Search" button
wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'button.Button')))[1].click()
time.sleep(5)
Things to notice:
- You can locate the <kbd>Search</kbd> button with the simple CSS selector button.Button.
- There are two buttons matching this selector: the first is the <kbd>Clear Filter</kbd> button and the second is the <kbd>Search</kbd> button (hence the [1] index in the code above).