Capture all data in tr tag within binance.com using Python Selenium

Question

I am unable to read all the data in the tbody tag on the Binance futures page using Python Selenium. I am trying to scrape this link: https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8

I used the command below:

tr = driver.find_elements(By.TAG_NAME,'tbody')

but there is no text output.

I'm trying to get all the data in the tr tags under the tbody tag in an array or an list object. I also need to know how many tr tag in the link.

Answer 1

Score: 1


To get all the data in the &lt;tr&gt; tags within the &lt;tbody&gt; tag in a list object you need to induce WebDriverWait for visibility_of_all_elements_located() and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and get_attribute("textContent"):

    driver.get(&quot;https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8&quot;)
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, &quot;tbody.bn-table-tbody tr&quot;)))
    for element in elements:
    	print(element.get_attribute(&quot;textContent&quot;))
    driver.quit()
    
  • Using XPATH and get_attribute("textContent"):

    driver.get(&quot;https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8&quot;)
    elements = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, &quot;//tbody[@class=&#39;bn-table-tbody&#39;]//tr&quot;)))
    for element in elements:
    	print(element.get_attribute(&quot;textContent&quot;))
    driver.quit()
    
  • Console Output:

    SOLUSDT Perpetual Short20x703621.952020.65609,118.79 (125.4862%)2023-03-03 19:31:49Trade
    ETHUSDT Perpetual Short30x385.3831,562.541,568.19-2,176.72 (-10.8052%)2023-03-05 04:03:30Trade
    EOSUSDT Perpetual Short20x138526.51.2721.2078,996.67 (107.6456%)2023-03-04 05:12:13Trade
    COCOSUSDT Perpetual Short10x33878.52.2631201.58500022,973.69 (427.8359%)2023-03-03 06:50:52Trade
    SSVUSDT Perpetual Short10x1010.344.25224938.0900006,225.72 (161.7813%)2023-03-03 20:05:15Trade
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
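
If you also need the number of rows, the same list that the wait returns can be measured with len(). The snippet below is a minimal end-to-end sketch rather than a verified script: it assumes Selenium 4 (so webdriver.Chrome() resolves the driver via Selenium Manager) and that the tbody.bn-table-tbody selector used above still matches the page.

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()  # Selenium 4+: Selenium Manager fetches a matching chromedriver
    driver.get("https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8")

    # wait until the JavaScript-rendered rows are visible
    rows = WebDriverWait(driver, 20).until(
        EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tbody.bn-table-tbody tr"))
    )

    # one text entry per <tr>, plus the row count asked for in the question
    row_texts = [row.get_attribute("textContent") for row in rows]
    print(len(row_texts), "rows")
    for text in row_texts:
        print(text)

    driver.quit()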
    

Answer 2

Score: 1


For your task, you can use selenium + BeautifulSoup. Open the page in selenium, wait for the page to load, and then use the received data as a 'soup' object. First we find 'tbody', then we search for all 'tr', and for each 'tr' we find all 'td'. We extract the data and write it to a list. The first element is 'Symbol', the second is the total number of 'td' elements in the row, and then all the data from the table. Code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time

url = 'https://www.binance.com/en/futures-activity/leaderboard/user/um?encryptedUid=14507FCBFF9FBE584EDDEC628C4593B8'

# kept for reference; Selenium itself does not send custom request headers
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/93.0.4577.82 Safari/537.36",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,"
              "application/signed-exchange;v=b3;q=0.9",
}


def get_result(url, headers):
    # configure a headless Chrome instance
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    driver = webdriver.Chrome(options=options, executable_path=".../chromedriver_linux64/chromedriver")  # the path to your chromedriver
    driver.get(url)
    time.sleep(10)  # give the JavaScript-rendered table time to load
    html = driver.page_source
    driver.quit()

    soup = BeautifulSoup(html, "lxml")
    tbody = soup.find('tbody', class_='bn-table-tbody')
    trs = tbody.find_all('tr')
    data = list()
    for tr in trs:
        tr_key = tr.get('data-row-key')
        if tr_key is None:
            continue  # skip rows that carry no symbol key
        mid_data = list()
        mid_data.append(f'Symbol - {tr_key}')
        tds = tr.find_all('td')
        mid_data.append(f'td_count - {len(tds)}')
        for count, td in enumerate(tds, start=1):
            mid_data.append(f'td_{count} - {td.text}')
        print(mid_data)
        data.append(mid_data)
    return data


def main():
    get_result(url=url, headers=headers)


if __name__ == "__main__":
    main()

Will return:

[&#39;Symbol - SOLUSDT&#39;, &#39;td_count - 7&#39;, &#39;td_1 - SOLUSDT Perpetual Short20x&#39;, &#39;td_2 - 7036&#39;, &#39;td_3 - 21.9520&#39;, &#39;td_4 - 20.4050&#39;, &#39;td_5 - 10,884.64\xa0(151.6288%)&#39;, &#39;td_6 - 2023-03-03 17:01:49&#39;, &#39;td_7 - Trade&#39;]
[&#39;Symbol - ETHUSDT&#39;, &#39;td_count - 7&#39;, &#39;td_1 - ETHUSDT Perpetual Short30x&#39;, &#39;td_2 - 385.383&#39;, &#39;td_3 - 1,562.54&#39;, &#39;td_4 - 1,564.50&#39;, &#39;td_5 - -754.66\xa0(-3.7549%)&#39;, &#39;td_6 - 2023-03-05 01:33:30&#39;, &#39;td_7 - Trade&#39;]
[&#39;Symbol - EOSUSDT&#39;, &#39;td_count - 7&#39;, &#39;td_1 - EOSUSDT Perpetual Short20x&#39;, &#39;td_2 - 138526.5&#39;, &#39;td_3 - 1.272&#39;, &#39;td_4 - 1.175&#39;, &#39;td_5 - 13,383.85\xa0(164.4547%)&#39;, &#39;td_6 - 2023-03-04 02:42:13&#39;, &#39;td_7 - Trade&#39;]
[&#39;Symbol - COCOSUSDT&#39;, &#39;td_count - 7&#39;, &#39;td_1 - COCOSUSDT Perpetual Short10x&#39;, &#39;td_2 - 33878.5&#39;, &#39;td_3 - 2.263120&#39;, &#39;td_4 - 1.534000&#39;, &#39;td_5 - 24,701.49\xa0(475.3063%)&#39;, &#39;td_6 - 2023-03-03 04:20:52&#39;, &#39;td_7 - Trade&#39;]
[&#39;Symbol - SSVUSDT&#39;, &#39;td_count - 7&#39;, &#39;td_1 - SSVUSDT Perpetual Short10x&#39;, &#39;td_2 - 1010.3&#39;, &#39;td_3 - 44.252249&#39;, &#39;td_4 - 38.808423&#39;, &#39;td_5 - 5,499.90\xa0(140.2743%)&#39;, &#39;td_6 - 2023-03-03 17:35:15&#39;, &#39;td_7 - Trade&#39;]

You can process the final data as you like.
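
If you want the rows as structured records rather than printed lists, the data returned by get_result() can be reshaped into dictionaries. The sketch below is only an illustration: rows_to_records is a hypothetical helper, and the names in COLUMNS are assumptions about the leaderboard's visible headers, not values read from the page markup.

# hypothetical helper: turn each ['Symbol - X', 'td_count - 7', 'td_1 - ...', ...] row
# produced by get_result() into a dict keyed by assumed column names
COLUMNS = ["position", "size", "entry_price", "mark_price", "pnl", "time", "action"]  # assumed header labels

def rows_to_records(data):
    records = []
    for row in data:
        symbol = row[0].split(" - ", 1)[1]                      # 'Symbol - SOLUSDT' -> 'SOLUSDT'
        values = [item.split(" - ", 1)[1] for item in row[2:]]  # drop the 'td_N - ' prefixes
        records.append({"symbol": symbol, **dict(zip(COLUMNS, values))})
    return records

# usage: records = rows_to_records(get_result(url=url, headers=headers))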
