February 19, 2023, 09:08:44

How to scrape yahoo finance news headers with BeautifulSoup?

Question

I would like to scrape news from Yahoo Finance for a currency pair.

How do bs4's find() and find_all() work?

For this example:

[image: How to scrape Yahoo Finance news headers with BeautifulSoup?]

With this link:

I'm trying to extract the data ... but no data is scraped. Why? What's wrong?

I'm using this, but nothing is printed (except the tickers):

html = BeautifulSoup(source_s, "html.parser")
news_table_s = html.find_all("div", {"class": "Py(14px) Pos(r)"})
news_tables_s[ticker_s] = news_table_s
print("news_tables", news_tables_s)

I would like to extract the news headers (headlines) from a Yahoo Finance web page.

Answer 1

Score: 0

You have to iterate over your ResultSet to get anything out of it.

for e in html.find_all("div", {"class": "Py(14px) Pos(r)"}):
    print(e.h3.text)
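
To make the difference between find() and find_all() concrete, here is a minimal offline sketch. The SAMPLE markup is a hand-written stand-in and an assumption, not Yahoo's actual page; only the class string is taken from the question.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in markup (assumption); the real Yahoo Finance page differs.
SAMPLE = """
<div class="Py(14px) Pos(r)"><h3><a href="/a">First headline</a></h3></div>
<div class="Py(14px) Pos(r)"><h3><a href="/b">Second headline</a></h3></div>
<div class="Other"><p>Not a news item</p></div>
"""

html = BeautifulSoup(SAMPLE, "html.parser")

# find_all() returns a ResultSet: a list-like collection of every matching Tag.
results = html.find_all("div", {"class": "Py(14px) Pos(r)"})

# find() returns only the first matching Tag, or None when nothing matches.
first = html.find("div", {"class": "Py(14px) Pos(r)"})

# Iterate over the ResultSet to read the text of each element.
headlines = [e.h3.get_text(strip=True) for e in results]
print(headlines)
```

Printing the ResultSet itself shows the raw tags, and it shows an empty list when the fetched HTML does not contain those classes, which is worth checking first when nothing useful is printed.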

Recommendation: do not select elements by dynamic classes; use more static ids or the HTML structure instead, selected here via a CSS selector.

for e in html.select('div:has(>h3>a)'):
    print(e.h3.text)
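
The following sketch illustrates why a structural selector can be more robust than a class match. The SAMPLE markup and its class names are invented for illustration (an assumption); they only mimic the idea of generated class names that may change between page versions.

```python
from bs4 import BeautifulSoup

# Invented markup (assumption): class names differ from the ones in the question.
SAMPLE = """
<div class="Ov(h) Pend(44px)">
  <h3 class="Mb(5px)"><a href="/news/a">First headline</a></h3>
  <p>Summary text</p>
</div>
<div class="Zx(9) Qr(2px)">
  <h3><a href="/news/b">Second headline</a></h3>
</div>
<div class="Ov(h)"><h3>Heading without a link</h3></div>
"""

html = BeautifulSoup(SAMPLE, "html.parser")

# Structural match: any <div> with a direct <h3> child that has a direct <a> child.
structural = [e.h3.get_text(strip=True) for e in html.select("div:has(> h3 > a)")]

# Class match: finds nothing here because the class names are different.
by_class = html.find_all("div", {"class": "Py(14px) Pos(r)"})

print(structural)
print(len(by_class))
```

The structural selector still depends on the page layout, so it can break too; it just does not depend on class names.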

Example

from bs4 import BeautifulSoup
import requests

url = 'https://finance.yahoo.com/quote/EURUSD%3DX?p=EURUSD%3DX'

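# Name the parser explicitly to avoid bs4's "no parser was explicitly specified" warning
html = BeautifulSoup(requests.get(url).text, 'html.parser')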

for e in html.select('div:has(>h3>a)'):
    print(e.h3.text)

Output

EUR/USD steadies, but bears sharpen claws as dollar feasts on Fed bets
EUR/USD Weekly Forecast – Euro Gives Up Early Gains for the Week
EUR/USD Forecast – Euro Plunges Yet Again on Friday
EUR/USD Forecast – Euro Gives Up Early Gains
EUR/USD Forecast – Euro Continues to Test the Same Indicator
Dollar gains as inflation remains sticky; sterling retreats
Siemens Issues Blockchain Based Euro-Denominated Bond on Polygon Blockchain
EUR/USD Forecast – Euro Rallies
FOREX-Dollar slips with inflation in focus; euro, sterling up on jobs data
FOREX-Jobs figures send euro, sterling higher; dollar slips before CPI
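
Since the question stores results per ticker, here is a hedged sketch of collecting headline strings into a dict keyed by ticker. The helper extract_headlines, the pages dict, and its HTML strings are hypothetical stand-ins; in practice each value would be the page source fetched for that ticker.

```python
from bs4 import BeautifulSoup


def extract_headlines(page_source):
    """Return the headline strings found in one page's HTML (hypothetical helper)."""
    html = BeautifulSoup(page_source, "html.parser")
    return [e.h3.get_text(strip=True) for e in html.select("div:has(> h3 > a)")]


# Hypothetical stand-in pages keyed by ticker (assumption, not real Yahoo markup).
pages = {
    "EURUSD=X": '<div><h3><a href="#">Euro headline</a></h3></div>',
    "GBPUSD=X": (
        '<div><h3><a href="#">Sterling headline one</a></h3></div>'
        '<div><h3><a href="#">Sterling headline two</a></h3></div>'
    ),
}

# Store plain strings per ticker instead of raw ResultSet objects.
news_headlines = {ticker: extract_headlines(src) for ticker, src in pages.items()}

for ticker, headlines in news_headlines.items():
    print(ticker, headlines)
```

Storing the extracted text rather than the ResultSet keeps the printed dict readable and makes an empty result for a ticker easy to spot.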
