How to scrape data that appears when a button is clicked?

Question

I am trying to scrape phone numbers from a website, but the numbers appear only after I click on the first number. In other words, the phone number is hidden in the HTML code and is shown only when I click. Can you help, please?
I used the following code:

    import requests
    from bs4 import BeautifulSoup

    url = "https://hipages.com.au/connect/makermanservices"
    req = requests.get(url).text
    soup = BeautifulSoup(req, "html.parser")
    phone = soup.find('a', class_='PhoneNumber__MobileOnly-sc-4ewwun-1 izNnbI phone-number__mobile')
    print(phone)

Answer 1

Score: 1

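The page embeds its data as JSON in a script tag (window.__INITIAL_PROPS__), so with a little hacking you can read the phone number from there with the help of bs4 and pandas, without simulating a click.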

For example:

    import json
    import re
    from io import StringIO

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = "https://hipages.com.au/connect/makermanservices"
    script_text = "window.__INITIAL_PROPS__="
    headers = {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36 Edg/112.0.1722.48",
    }

    soup = BeautifulSoup(requests.get(url, headers=headers).content, "lxml")
    # Find the <script> tag whose text contains the embedded page state.
    script = soup.find("script", string=lambda t: t and script_text in t)
    # Decode everything after the marker. pd.read_json expects text, so this
    # relies on the value being JSON that is itself encoded as a string.
    data = json.loads(re.search(re.escape(script_text) + r"(.+)", script.string).group(1))
    # Wrap the text in StringIO: recent pandas versions deprecate passing a
    # literal JSON string to read_json.
    phone = (
        pd.read_json(StringIO(data))
        ["fetchKey-7-0-0_/connect/makermanservices"]
        ["site"]
        ["primary_location"]
        ["phone"]
    )
    print(phone)

This should print:

    1800 801 828
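
The same idea can be sketched without pandas, and without hard-coding the "fetchKey-7-0-0_..." key, which may well change between pages. The snippet below is only an illustration under stated assumptions: SAMPLE_HTML is a made-up stand-in for the downloaded page (the real markup may differ), the state is assumed to be stored as JSON inside a JSON string, and extract_initial_props and find_key are hypothetical helper names, not part of any library. It uses only the standard library so that it runs offline.

```python
import json

MARKER = "window.__INITIAL_PROPS__="

# Made-up payload that mimics the nesting used in the answer above.
payload = {"fetchKey-demo": {"site": {"primary_location": {"phone": "1800 801 828"}}}}
# Assumption: the page stores the state as a JSON string literal (JSON inside JSON).
embedded = json.dumps(json.dumps(payload))
SAMPLE_HTML = f"<html><body><script>{MARKER}{embedded}</script></body></html>"


def extract_initial_props(html):
    """Return the decoded value assigned to window.__INITIAL_PROPS__."""
    start = html.find(MARKER)
    if start == -1:
        raise ValueError("marker not found in page")
    rest = html[start + len(MARKER):].lstrip()
    # raw_decode parses one JSON value and ignores whatever follows it,
    # such as a trailing semicolon or the closing </script> tag.
    data, _ = json.JSONDecoder().raw_decode(rest)
    if isinstance(data, str):
        # The value was a JSON-encoded string, so decode it a second time.
        data = json.loads(data)
    return data


def find_key(node, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


props = extract_initial_props(SAMPLE_HTML)
phones = list(find_key(props, "phone"))
print(phones)  # ['1800 801 828']
```

For a real page, the html argument would be the text returned by requests.get(url, headers=headers), as in the answer above. Searching for the key recursively trades precision for robustness: it returns every "phone" value in the state, so the result may need filtering.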

huangapple
  • Published on April 19, 2023 at 21:50:00
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76055331.html