How to scrape data that appears only after clicking a button?
Question
I am trying to scrape phone numbers from a website, but the numbers appear only after I click on the first number. In other words, the phone number is hidden in the HTML code and is revealed only on click. Can you help, please?
I used the following code:
import requests
from bs4 import BeautifulSoup
url = "https://hipages.com.au/connect/makermanservices"
req = requests.get(url).text
soup = BeautifulSoup(req,"html.parser")
phone = soup.find('a', class_='PhoneNumber__MobileOnly-sc-4ewwun-1 izNnbI phone-number__mobile')
print(phone)
Answer 1
Score: 1
With a little bit of hacking, you can get the phone number with the help of bs4 and pandas.
For example:
import json
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://hipages.com.au/connect/makermanservices"
# Marker that precedes the embedded data inside one of the page's <script> tags.
script_text = "window.__INITIAL_PROPS__="
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/112.0.0.0 Safari/537.36 Edg/112.0.1722.48",
}

soup = BeautifulSoup(requests.get(url, headers=headers).content, "lxml")
# Find the <script> tag whose text contains the marker.
script = soup.find("script", string=lambda t: t and script_text in t)
# Decode everything that follows the marker as JSON.
data = json.loads(re.search(script_text + r"(.+)", script.string).group(1))

# Drill down to the phone number of the primary location.
df = (
    pd.read_json(data)
    ["fetchKey-7-0-0_/connect/makermanservices"]
    ["site"]
    ["primary_location"]
    ["phone"]
)
print(df)
This should print:
1800 801 828
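The same idea can be sketched without pandas and without hard-coding the `fetchKey-...` key, which may well change between pages. The snippet below is only an illustration under stated assumptions: `SAMPLE_HTML` is a made-up stand-in for the downloaded page (the real payload's layout may differ), and `extract_initial_props` and `find_values` are hypothetical helper names, not part of any library.

```python
import json

from bs4 import BeautifulSoup

MARKER = "window.__INITIAL_PROPS__="

# Hypothetical stand-in for the downloaded page; the real payload may look different.
SAMPLE_HTML = """
<html><body>
<script>window.__INITIAL_PROPS__={"fetchKey-demo": {"site": {"primary_location": {"phone": "1800 000 000"}}}};</script>
</body></html>
"""


def extract_initial_props(html, marker=MARKER):
    """Return the data embedded after `marker` in a <script> tag."""
    soup = BeautifulSoup(html, "html.parser")
    script = soup.find("script", string=lambda t: t and marker in t)
    if script is None:
        raise ValueError("marker not found in any <script> tag")
    start = script.string.index(marker) + len(marker)
    # raw_decode parses one JSON value and ignores what follows it (e.g. a trailing ';').
    value, _ = json.JSONDecoder().raw_decode(script.string[start:].lstrip())
    # Assumption: the payload may be a JSON-encoded string; decode once more if so.
    if isinstance(value, str):
        value = json.loads(value)
    return value


def find_values(node, key):
    """Yield every value stored under `key` anywhere in nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_values(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


props = extract_initial_props(SAMPLE_HTML)
print(list(find_values(props, "phone")))
```

For the real site you would pass `requests.get(url, headers=headers).text` instead of `SAMPLE_HTML`. Searching for the `phone` key recursively trades precision for robustness: it may return more than one number if the payload contains several.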