英文:
Extract data from an HTML table on a Web page into Excel
问题
我需要一些关于使用VBA从网站获取数据的帮助。我在Excel中有一张包含ETF代码、链接和价格的表格,我试图使用VBA从每个链接中获取昨天的收盘价,但问题是我不确定在这个位置应该写什么"HTMLDoc.getElementById(???)"。我无法在该网站的HTML代码中找到任何ID,希望你可以帮助我。
网站示例:https://www.boerse-frankfurt.de/en/etf/amundi-prime-global-ucits-etf-dr-c/price-history/historical-prices-and-volumes
我的代码:
Sub GetETFPrices()
Dim IE As New InternetExplorer
Dim HTMLDoc As HTMLDocument
Dim ETFLink As String
Dim ETFPrice As String
Dim i As Long
' 遍历表格中的每一行
For i = 2 To ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
ETFLink = ThisWorkbook.Worksheets("Sheet1").Cells(i, 2).Value ' 从第2列获取链接
' 在Internet Explorer中打开链接
IE.Navigate ETFLink
Do While IE.Busy
DoEvents
Loop
' 从HTML文档中获取ETF价格元素
Set HTMLDoc = IE.document
Set ETFRow = HTMLDoc.getElementById("???").Rows(HTMLDoc.getElementById("???").Rows.Length - 1)
ETFPrice = ETFRow.Cells(3).innerText
' 更新Excel表格中的价格列
ThisWorkbook.Worksheets("Sheet1").Cells(i, 3).Value = ETFPrice
Next i
IE.Quit
End Sub
英文:
I need some help with taking data from website using vba. I have table in excel with etf ticker, link and price and using vba I try to take close price of yesterday from each link but the problem is that I am not sure what to write at this place "HTMLDoc.getElementById("???")". I can't find any ID from that website html code, hope you can help me with that.
website example: https://www.boerse-frankfurt.de/en/etf/amundi-prime-global-ucits-etf-dr-c/price-history/historical-prices-and-volumes
my code:
Sub GetETFPrices()
Dim IE As New InternetExplorer
Dim HTMLDoc As HTMLDocument
Dim ETFLink As String
Dim ETFPrice As String
Dim i As Long
' Loop through each row in the table
For i = 2 To ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).row
ETFLink = ThisWorkbook.Worksheets("Sheet1").Cells(i, 2).Value ' Get link from column 2
' Opening the link in Internet Explorer
IE.Navigate ETFLink
Do While IE.Busy
DoEvents
Loop
' Get the ETF price element from the HTML document
Set HTMLDoc = IE.document
Set ETFRow = HTMLDoc.getElementById("???").Rows(HTMLDoc.getElementById("???").Rows.Length - 1)
ETFPrice = ETFRow.Cells(3).innerText
' Updating the price column in Excel sheet
ThisWorkbook.Worksheets("Sheet1").Cells(i, 3).Value = ETFPrice
Next i
IE.Quit
End Sub
答案1
得分: 1
如果您不必使用Internet Explorer,我建议使用Selenium,它与Chrome(和FireFox)兼容。
现在,看一下您提供的链接页面,这是一个情况,您的数据位于表格中,并且没有id
属性可用于定位特定单元格。因此,我建议将整个表格从页面导入Excel中,然后您可以更容易地导航以查找您感兴趣的数据点。
您可以使用以下方法:
Sub ImportHtmlTable()
'初始化Selenium
Dim bot As WebDriver
Set bot = New WebDriver
bot.Start "chrome", "YourUrl"
bot.Get "/"
DoEvents
Dim Tables As WebElements
Set Tables = bot.FindElementsByTag("table")
Dim wb As Workbook
Set wb = Workbooks("YourWorkbookName")
HtmlTablesToRange Tables, wb.Sheets(1).Range("A1")
bot.Close
Set bot = Nothing
End Sub
'受以下代码启发:https://www.vba-market.com/
Sub HtmlTablesToRange(Tables As WebElements, Destination As Range)
Destination.CurrentRegion.ClearContents
Dim tb As WebElement
Dim ths As WebElements '表头 (th)
Dim trs As WebElements '行 (tr)
Dim tds As WebElements '数据单元格 (td)
For Each tb In Tables
Dim theads As WebElements
Set theads = tb.FindElementsByTag("thead")
Dim thead As WebElement
For Each thead In theads
Set trs = thead.FindElementsByTag("tr")
Dim tr As WebElement
For Each tr In trs
Set ths = tr.FindElementsByTag("th")
Dim y As Long, z As Long
y = 0 ' 重置到第一列
Dim th As WebElement
For Each th In ths
Destination.Offset(z, y).Value = th.text
y = y + 1
Next th
z = z + 1
Next tr
Next thead
Dim tbodys As WebElements
Set tbodys = tb.FindElementsByTag("tbody")
Dim tbody As WebElement
For Each tbody In tbodys
Set trs = tbody.FindElementsByTag("tr")
For Each tr In trs
Set tds = tr.FindElementsByTag("td")
y = 0 ' 重置到第一列
Dim td As WebElement
For Each td In tds
Destination.Offset(z, y).Value = td.text
y = y + 1
Next td
z = z + 1
Next tr
Next tbody
z = z + 1
Next tb
End Sub
如果需要提高性能,您还可以在代码执行期间关闭Application.ScreenUpdating
。
请注意,您可能需要更新Chrome驱动程序,通常位于C:\Users\YourUserName\AppData\Local\SeleniumBasic
免责声明:请确保您被允许从您感兴趣的网站收集数据。
英文:
If you don't have to use Internet Explorer, I would suggest to use Selenium which is compatible with Chrome (and FireFox).
Now, looking at the page you linked, this is the case where your data is inside a table and there is no id
attribute that will allow you to target a specific cell. Because of that, I would suggest to import the whole table from the page in Excel and then you can more easily navigate to find the data point you are interested in.
You could use this approach for instance:
Sub ImportHtmlTable()
'Initialize Selenium
Dim bot As WebDriver
Set bot = New WebDriver
bot.Start "chrome", "YourUrl"
bot.Get "/"
DoEvents
Dim Tables As WebElements
Set Tables = bot.FindElementsByTag("table")
Dim wb As Workbook
Set wb = Workbooks("YourWorkbookName")
HtmlTablesToRange Tables, wb.Sheets(1).Range("A1")
bot.Close
Set bot = Nothing
End Sub
'Inspired by code from: https://www.vba-market.com/
Sub HtmlTablesToRange(Tables As WebElements, Destination As Range)
Destination.CurrentRegion.ClearContents
Dim tb As WebElement
Dim ths As WebElements 'Headers (th)
Dim trs As WebElements 'Rows (tr)
Dim tds As WebElements 'Data cells (td)
For Each tb In Tables
Dim theads As WebElements
Set theads = tb.FindElementsByTag("thead")
Dim thead As WebElement
For Each thead In theads
Set trs = thead.FindElementsByTag("tr")
Dim tr As WebElement
For Each tr In trs
Set ths = tr.FindElementsByTag("th")
Dim y As Long, z As Long
y = 0 ' Resets back to first column
Dim th As WebElement
For Each th In ths
Destination.Offset(z, y).Value = th.text
y = y + 1
Next th
z = z + 1
Next tr
Next thead
Dim tbodys As WebElements
Set tbodys = tb.FindElementsByTag("tbody")
Dim tbody As WebElement
For Each tbody In tbodys
Set trs = tbody.FindElementsByTag("tr")
For Each tr In trs
Set tds = tr.FindElementsByTag("td")
y = 0 ' Resets back to first column
Dim td As WebElement
For Each td In tds
Destination.Offset(z, y).Value = td.text
y = y + 1
Next td
z = z + 1
Next tr
Next tbody
z = z + 1
Next tb
End Sub
If you need to improve performance, you can also turn off Application.ScreenUpdating
during the execution of the code.
Note that you might need to update the Chrome driver usually located in C:\Users\YourUserName\AppData\Local\SeleniumBasic
Disclaimer: Always make sure that you are allowed to gather data from the website you are interested in.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论