August 9, 2023 04:29:52

How to find data with specific tags in Python using requests module?

Question

This is the current HTML part that I need to get something from.

<table class="table table-hover sortable-theme-minimal table-heatmap" data-sortable="">
 <thead>
  <tr>
   <th>
   </th>
   <th>
    Price
   </th>
   <th>
   </th>
   <th>
   </th>
   <th data-heatmap="1" data-heatmap-limit="5" style="text-align: center;cursor:pointer;">
    Day
   </th>
   <th data-heatmap="1" data-heatmap-limit="20" style="text-align: center;cursor:pointer">
    Month
   </th>
   <th data-heatmap="1" data-heatmap-limit="100" style="text-align: center;cursor:pointer">
    Year
   </th>
   <th class="hidden-xs" style="text-align: center;">
    Date
   </th>
  </tr>
 </thead>
 <tr data-decimals="3" data-subscribe="CL1:COM" data-symbol="CL1:COM">
  <td>
   <a href="/commodity/crude-oil">
    Crude Oil
   </a>
  </td>
  <td id="p">
   82.86

I need to get that 82.86 number, but I can't seem to select the row <tr data-decimals="3" data-subscribe="CL1:COM" data-symbol="CL1:COM">.

Here is my current code:

URL = "https://tradingeconomics.com/commodity/crude-oil"
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
results = soup.find("table", class_="table table-hover sortable-theme-minimal table-heatmap")
results1 = results.find("tr", text="data-symbol=\"CL1:COM\"")
print(results.prettify())

Is there a way to select by the "data-symbol" attribute and get the data from the id="p" td under it?

Answer 1

Score: 1

You can use Beautiful Soup. Try this:

from bs4 import BeautifulSoup
import requests

URL = "https://tradingeconomics.com/commodity/crude-oil"
page = requests.get(URL)
soup = BeautifulSoup(page.content, "html.parser")

# Match the row by its data-symbol attribute rather than by its text.
target_tr = soup.find("tr", attrs={"data-symbol": "CL1:COM"})
# Within that row, the price is in the cell with id="p".
price_td = target_tr.find("td", id="p")
price = price_td.get_text(strip=True)
print("Price:", price)

Selenium could also prove useful. Good luck!
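
For anyone who wants to try the lookup without hitting the site, here is a minimal offline sketch. It assumes a trimmed copy of the table from the question, with the closing tags added by hand so it parses cleanly. The helper names find_price and find_price_css are made up for illustration. The second helper does the same lookup as one CSS selector, and both return None instead of raising when the row or cell is missing.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the table from the question; closing tags added by hand
# (assumption: the real page closes them the same way).
HTML = """
<table class="table table-hover sortable-theme-minimal table-heatmap">
 <tr data-decimals="3" data-subscribe="CL1:COM" data-symbol="CL1:COM">
  <td><a href="/commodity/crude-oil">Crude Oil</a></td>
  <td id="p">
   82.86
  </td>
 </tr>
</table>
"""


def find_price(html, symbol):
    """Return the text of the id="p" cell in the row for `symbol`, or None."""
    soup = BeautifulSoup(html, "html.parser")
    # text=... is compared against a tag's string content, so it never
    # sees attributes; attrs={...} is what filters on data-symbol.
    row = soup.find("tr", attrs={"data-symbol": symbol})
    if row is None:
        return None
    cell = row.find("td", id="p")
    return cell.get_text(strip=True) if cell is not None else None


def find_price_css(html, symbol):
    """Same lookup written as a single CSS selector."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.select_one(f'tr[data-symbol="{symbol}"] td#p')
    return cell.get_text(strip=True) if cell is not None else None


print(find_price(HTML, "CL1:COM"))
print(find_price_css(HTML, "CL1:COM"))
print(find_price(HTML, "XX:COM"))
```

If either helper returns None against the live page, the row is probably not in the HTML that requests received, which is the case where a browser-driven tool would be worth a try.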
