Capture all data in tr tag within binance.com using Python Selenium
Question
I am unable to read all the data in the tbody tag on the Binance futures page using Python Selenium. I am trying to scrape this link: https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8
I used the command below:
tr = driver.find_elements(By.TAG_NAME, 'tbody')
but it produces no text output.
I'm trying to get all the data from the tr tags under the tbody tag into an array or a list object. I also need to know how many tr tags the page contains.
Answer 1
Score: 1
To get the data from all the <tr> tags within the <tbody> tag as a list, you need to induce WebDriverWait for visibility_of_all_elements_located(), which gives the page time to render the rows before they are read. You can use either of the following locator strategies:
- Using CSS_SELECTOR and get_attribute("textContent"):
    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tbody.bn-table-tbody tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
- Using XPATH and get_attribute("textContent"):
    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tbody[@class='bn-table-tbody']//tr")))
    for element in elements:
        print(element.get_attribute("textContent"))
    driver.quit()
- Console output:
    SOLUSDT Perpetual Short20x703621.952020.65609,118.79 (125.4862%)2023-03-03 19:31:49Trade
    ETHUSDT Perpetual Short30x385.3831,562.541,568.19-2,176.72 (-10.8052%)2023-03-05 04:03:30Trade
    EOSUSDT Perpetual Short20x138526.51.2721.2078,996.67 (107.6456%)2023-03-04 05:12:13Trade
    COCOSUSDT Perpetual Short10x33878.52.2631201.58500022,973.69 (427.8359%)2023-03-03 06:50:52Trade
    SSVUSDT Perpetual Short10x1010.344.25224938.0900006,225.72 (161.7813%)2023-03-03 20:05:15Trade
- Note: You have to add the following imports:
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
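The console output above runs the cell values of each row together, because the textContent of a <tr> is the concatenation of its cells, and it does not say how many rows there are. A minimal sketch of one way to keep the cells separate and count the rows follows. The helper name rows_to_cells is hypothetical, and it assumes that `rows` is the list of row elements returned by the WebDriverWait call in this answer.

```python
def rows_to_cells(rows):
    """Return one list of cell strings per <tr> element.

    `rows` is assumed to be a list of Selenium WebElements, one per <tr>,
    such as the `elements` list produced by the WebDriverWait call.
    """
    table = []
    for row in rows:
        # "tag name" is the string value of selenium's By.TAG_NAME constant;
        # the literal is used so this helper needs no import of its own.
        cells = row.find_elements("tag name", "td")
        texts = [(cell.get_attribute("textContent") or "").strip() for cell in cells]
        if texts:  # ignore rows that contain no <td> cells
            table.append(texts)
    return table


# Hypothetical usage with a live driver:
#   table = rows_to_cells(elements)
#   print(len(table))   # number of rows that contain cells
#   print(table[0])     # cell values of the first row, as a list
```

len(elements) gives the raw number of matched <tr> tags, while len(table) counts only the rows that actually hold cells.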
Answer 2
Score: 1
For this task you can combine Selenium with BeautifulSoup. Open the page in Selenium, wait for it to load, and then parse the page source into a 'soup' object. First find the 'tbody', then find every 'tr' inside it, and for each 'tr' find all of its 'td' cells. The extracted values go into one list per row: the first element is the symbol, the second is the number of 'td' cells in the row, and the rest are the cell values. Code:
import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

url = 'https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8'

# Not applied anywhere below: Selenium drives a real browser, which sends its own headers.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/93.0.4577.82 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,"
              "application/signed-exchange;v=b3;q=0.9",
}


def get_result(url, headers):
    # Build a single options object and pass that same object to the driver.
    options = Options()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    # Selenium 4 takes the chromedriver path through a Service object.
    service = Service(executable_path=".../chromedriver_linux64/chromedriver")  # the path to your chromedriver
    driver = webdriver.Chrome(service=service, options=options)
    try:
        driver.get(url)
        time.sleep(10)  # give the page time to render the table
        html = driver.page_source
    finally:
        driver.quit()  # always close the browser

    soup = BeautifulSoup(html, "lxml")
    tbody = soup.find('tbody', class_='bn-table-tbody')
    trs = tbody.find_all('tr')
    data = list()
    for tr in trs:
        tr_key = tr.get('data-row-key')
        if tr_key is None:
            # Rows without a data-row-key attribute are skipped.
            continue
        mid_data = list()
        mid_data.append(f'Symbol - {tr_key}')
        tds = tr.find_all('td')
        mid_data.append(f'td_count - {len(tds)}')
        for count, td in enumerate(tds, start=1):
            mid_data.append(f'td_{count} - {td.text}')
        print(mid_data)
        data.append(mid_data)
    return data


def main():
    get_result(url=url, headers=headers)


if __name__ == "__main__":
    main()
Output:
['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x', 'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050', 'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']
['Symbol - ETHUSDT', 'td_count - 7', 'td_1 - ETHUSDT Perpetual Short30x', 'td_2 - 385.383', 'td_3 - 1,562.54', 'td_4 - 1,564.50', 'td_5 - -754.66\xa0(-3.7549%)', 'td_6 - 2023-03-05 01:33:30', 'td_7 - Trade']
['Symbol - EOSUSDT', 'td_count - 7', 'td_1 - EOSUSDT Perpetual Short20x', 'td_2 - 138526.5', 'td_3 - 1.272', 'td_4 - 1.175', 'td_5 - 13,383.85\xa0(164.4547%)', 'td_6 - 2023-03-04 02:42:13', 'td_7 - Trade']
['Symbol - COCOSUSDT', 'td_count - 7', 'td_1 - COCOSUSDT Perpetual Short10x', 'td_2 - 33878.5', 'td_3 - 2.263120', 'td_4 - 1.534000', 'td_5 - 24,701.49\xa0(475.3063%)', 'td_6 - 2023-03-03 04:20:52', 'td_7 - Trade']
['Symbol - SSVUSDT', 'td_count - 7', 'td_1 - SSVUSDT Perpetual Short10x', 'td_2 - 1010.3', 'td_3 - 44.252249', 'td_4 - 38.808423', 'td_5 - 5,499.90\xa0(140.2743%)', 'td_6 - 2023-03-03 17:35:15', 'td_7 - Trade']
You can process the final data as you like.
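As one possible way to process these lists, the sketch below turns a printed row into a dict and converts the 'td_5' cell into two numbers. The function names parse_row and parse_amount_and_percent are hypothetical, the generic 'td_n' keys are kept because the answer does not name the columns, and the "amount (percent%)" layout of 'td_5' is an assumption read off the output shown above.

```python
import re


def parse_row(mid_data):
    """Turn ['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - ...'] into a dict."""
    record = {}
    for item in mid_data:
        # Split on the first " - " only, so negative values keep their sign.
        key, _, value = item.partition(" - ")
        # Replace the non-breaking space (\xa0) seen in the output with a plain space.
        record[key] = value.replace("\xa0", " ").strip()
    return record


def parse_amount_and_percent(text):
    """Split a cell such as '10,884.64 (151.6288%)' into two floats."""
    match = re.fullmatch(r"(-?[\d,]+(?:\.\d+)?)\s*\((-?\d+(?:\.\d+)?)%\)", text)
    if match is None:
        raise ValueError(f"unexpected cell format: {text!r}")
    amount = float(match.group(1).replace(",", ""))  # drop thousands separators
    percent = float(match.group(2))
    return amount, percent


# First row of the output above, copied verbatim.
row = ['Symbol - SOLUSDT', 'td_count - 7', 'td_1 - SOLUSDT Perpetual Short20x',
       'td_2 - 7036', 'td_3 - 21.9520', 'td_4 - 20.4050',
       'td_5 - 10,884.64\xa0(151.6288%)', 'td_6 - 2023-03-03 17:01:49', 'td_7 - Trade']

record = parse_row(row)
amount, percent = parse_amount_and_percent(record["td_5"])
print(record["Symbol"], amount, percent)
```

The number of rows is simply the length of the list that collects these records.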