英文:
How to Load the Earnings Calendar data from TradingView link and into Dataframe
问题
I want to load the Earnings Calendar data from TradingView link and load into Dataframe.
Link: https://in.tradingview.com/markets/stocks-india/earnings/
Filter-1: Data for "This Week"
I am not able to select the Tab "This Week". Any help?
Answer is Closed so posting here:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 10000)
driver = webdriver.Chrome()
driver.get("https://in.tradingview.com/markets/stocks-india/earnings/")
driver.find_element(By.XPATH, "//div[.='This Week']").click()
time.sleep(5)
visible_columns = driver.find_elements(By.CSS_SELECTOR, 'div.tv-screener__content-pane thead th:not([class*=i-hidden])')
data_field = [c.get_attribute('data-field') for c in visible_columns]
header = [c.text.split('\n')[0] for c in visible_columns]
rows = driver.find elements(By.XPATH, "//div[@class='tv-screener__content-pane']//tbody/tr")
columns = []
for field in data_field:
column = driver.find_elements(By.XPATH, f"//div[@class='tv-screener__content-pane']//tbody/tr/td[@data-field-key='{field}']")
columns.append([col.text.replace('\n',' - ') for col in column])
df = pd.DataFrame(dict(zip(header, columns)))
print(df)
driver.quit()
(Note: The code section has not been translated, as requested.)
英文:
I want to load the Earnings Calendar data from TradingView link and load into Dataframe.
Link: https://in.tradingview.com/markets/stocks-india/earnings/
Filter-1: Data for "This Week"
I am not able to select the Tab "This Week". Any help ?
Answer is Closed so posting here:
import time
from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 100)
pd.set_option('display.width', 10000)
driver = webdriver.Chrome()
driver.get("https://in.tradingview.com/markets/stocks-india/earnings/")
driver.find_element(By.XPATH, "//div[.='This Week']").click()
time.sleep(5)
visible_columns = driver.find_elements(By.CSS_SELECTOR, 'div.tv-screener__content-pane thead th:not([class*=i-hidden])')
data_field = [c.get_attribute('data-field') for c in visible_columns]
header = [c.text.split('\n')[0] for c in visible_columns]
rows = driver.find_elements(By.XPATH, "//div[@class='tv-screener__content-pane']//tbody/tr")
columns = []
for field in data_field:
column = driver.find_elements(By.XPATH, f"//div[@class='tv-screener__content-pane']//tbody/tr/td[@data-field-key='{field}']")
columns.append([col.text.replace('\n',' - ') for col in column])
df = pd.DataFrame(dict(zip(header, columns)))
print(df)
driver.quit()
答案1
得分: 2
我注意到有一些由类i-hidden
特征的隐藏列。所以首先,我们使用CSS选择器:not([class*=i-hidden])
只选择可见列。然后,我们获取这些列的data-field
属性,以便可以选择行中对应的值。这个属性的示例值包括:
- name
- market_cap_basic
- earnings_per_share_forecast_next_fq
- eps_surprise_fq
接下来,我们获取表格的标题和行。然后,我们循环遍历data-field
以获取每列中的所有单元格值。最后,我们从一个字典中创建一个数据框,其中标题作为键,列作为值。
visible_columns = driver.find_elements(By.CSS_SELECTOR, 'div.tv-screener__content-pane thead th:not([class*=i-hidden])')
data_field = [c.get_attribute('data-field') for c in visible_columns]
header = [c.text.split('\n')[0] for c in visible_columns]
rows = driver.find_elements(By.XPATH, "//div[@class='tv-screener__content-pane']//tbody/tr")
columns = []
for field in data_field:
column = driver.find_elements(By.XPATH, f"//div[@class='tv-screener__content-pane']//tbody/tr/td[@data-field-key='{field}']")
columns.append([col.text.replace('\n',' - ') for col in column])
pd.DataFrame(dict(zip(header, columns)))
输出:
英文:
I noticed that there are few hidden columns characterized by the class i-hidden
. So as first thing we select only the visible columns with the css selector :not([class*=i-hidden])
. Then we get the attribute data-field
of these columns, so that we can select the corresponding values in the rows. Examples of values for this attribute are
name
market_cap_basic
earnings_per_share_forecast_next_fq
eps_surprise_fq
Next we get the header of the table and the rows. Then we loop over the data-field to get all the cell values in each column. Finally we create a dataframe from a dictionary having the header as keys and the columns as values.
visible_columns = driver.find_elements(By.CSS_SELECTOR, 'div.tv-screener__content-pane thead th:not([class*=i-hidden])')
data_field = [c.get_attribute('data-field') for c in visible_columns]
header = [c.text.split('\n')[0] for c in visible_columns]
rows = driver.find_elements(By.XPATH, "//div[@class='tv-screener__content-pane']//tbody/tr")
columns = []
for field in data_field:
column = driver.find_elements(By.XPATH, f"//div[@class='tv-screener__content-pane']//tbody/tr/td[@data-field-key='{field}']")
columns.append([col.text.replace('\n',' - ') for col in column])
pd.DataFrame(dict(zip(header, columns)))
Output
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论