How to find the ID of an HTML table

Question

I'd like to parse a table and download it with Jsoup in Java. I know I can use the getElementById function for that. My problem is: how do I find that id in the HTML code of a website?
As an example, take the first table in this Wikipedia article.

Answer 1

Score: 0

This Python script may help you download the source code of a web page:

    from urllib.request import urlopen

    # Download the raw HTML of the page and save it to source.html.
    url = "https://support.image-line.com/member/profile.php?module=Unlock"
    with urlopen(url) as response:
        html = response.read()
    with open("source.html", "wb") as f:
        f.write(html)

Then trim the file contents, also with Python, deleting everything before the <tbody> tag and everything from its closing tag onward.

Example:

    with open("source.html", "r") as f:
        content = f.read()

    # Keep everything from the first <tbody> up to its closing tag.
    position = content.find("<tbody>")
    content = content[position:]
    substring = content.split("</tbody>", 1)[0]

    # The with blocks close both files, so no explicit close() is needed.
    with open("table.html", "w") as out:
        out.write(substring)

You now have a file named "table.html" that contains the contents of the first table body.
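
To get at the ids themselves, which is what the question asks about, a minimal sketch using Python's standard html.parser module could list the id attribute of every <table> in the downloaded source. The names TableIdCollector and find_table_ids and the sample markup are made up for illustration; they do not come from the original answer. A table reported as None has no id attribute at all, so getElementById cannot reach it and it would have to be selected some other way, for example by tag name or class.

```python
from html.parser import HTMLParser


class TableIdCollector(HTMLParser):
    """Collect the id attribute (or None) of every <table> start tag."""

    def __init__(self):
        super().__init__()
        self.table_ids = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs.
        if tag == "table":
            self.table_ids.append(dict(attrs).get("id"))


def find_table_ids(html_text):
    """Return one entry per table, in document order; None means no id."""
    collector = TableIdCollector()
    collector.feed(html_text)
    collector.close()
    return collector.table_ids


# Hypothetical sample markup; in practice, pass the text read from source.html.
sample = """
<table id="stats" class="wikitable"><tbody><tr><td>1</td></tr></tbody></table>
<table class="wikitable"><tbody><tr><td>2</td></tr></tbody></table>
"""
print(find_table_ids(sample))  # ['stats', None]
```

You can also find an id by hand: open the page in a browser, right-click the table, choose "Inspect", and look for an id attribute on the <table> element.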

Posted by huangapple on April 6, 2020 at 17:58:37
Source: https://go.coder-hub.com/61057191.html