Python web-scraping issue: unable to retrieve body section data from a specific URL
Question
I am trying to scrape a page with Python, but I cannot retrieve any data from the body section. Any help with this problem would be appreciated.
Python code:
import requests
from bs4 import BeautifulSoup

url_kakao = "https://www.kakaopay.com/news/pr"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/**********"}
# The user-agent information may be considered confidential, so it has been replaced with "*"

# Download the page and stop early on an HTTP error
res_kakao = requests.get(url_kakao, headers=headers)
res_kakao.raise_for_status()

# Parse the returned HTML and look for the news container by its CSS classes
soup_kakao = BeautifulSoup(res_kakao.text, "lxml")
kakao = soup_kakao.find_all("div", attrs={"class": "css-1mqcdgs e2lpi48"})
print(kakao)
→ Result: NONE
The output is 'NONE', presumably because the data in the body section is not being scraped.
Is it not possible to scrape the url_kakao website?
Answer 1
Score: 0
The content is loaded dynamically via JavaScript from an API, so you can request that API directly:
import requests
requests.get('https://www.kakaopay.com/brand-api/news?page=0&size=10&locale=ko').json()
Simply adapt the request and adjust the value of the page parameter.
{'list': [{'id': 278,
'news_contents_category': 'COMMON',
'title': '5월, 카카오페이로 편의점 결제하면 무제한 혜택이!',
'present_dttm': '2023. 5. 17.'},
{'id': 277,
'news_contents_category': 'COMMON',
'title': '카카오페이, "3년 내 연 100억 건의 금융 니즈 해결 목표"',
'present_dttm': '2023. 5. 15.'},
{'id': 276,
'news_contents_category': 'COMMON',
'title': '카카오페이증권, ‘매일 이자 받기’ 서비스 시작',
'present_dttm': '2023. 5. 4.'},...]}
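Adjusting the page parameter, as described above, could be sketched roughly like this. The endpoint and the 'list' key come from the request and output shown here; the helper names fetch_page and iter_news, the max_pages cap, and the idea that an empty 'list' marks the end of the data are assumptions made for illustration, not documented behaviour of the site.

```python
from typing import Callable, Iterator

API_URL = "https://www.kakaopay.com/brand-api/news"


def fetch_page(page: int, size: int = 10) -> dict:
    """Fetch one page of news metadata from the endpoint shown above."""
    import requests  # imported lazily so iter_news can be used with another fetcher

    res = requests.get(
        API_URL,
        params={"page": page, "size": size, "locale": "ko"},
        timeout=10,
    )
    res.raise_for_status()
    return res.json()


def iter_news(
    fetch: Callable[[int], dict] = fetch_page, max_pages: int = 5
) -> Iterator[dict]:
    """Yield news items page by page, stopping at max_pages or an empty page."""
    for page in range(max_pages):
        items = fetch(page).get("list", [])
        if not items:  # assumption: an empty list means there are no more pages
            break
        yield from items
```

Something like `for item in iter_news(): print(item["id"], item["title"])` would then walk the first few pages. Passing the fetcher in as an argument keeps the paging logic separate from the HTTP call, so it can be tried out with canned data first.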
With the id from the JSON you can open the corresponding article, e.g. https://www.kakaopay.com/news/pr_detail?id=278
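As a small illustration of that last step, the sketch below builds article links from items shaped like the JSON output above. The pr_detail URL pattern is the one given in this answer; the helper name article_url and the trimmed sample list are only for demonstration.

```python
from urllib.parse import urlencode

DETAIL_URL = "https://www.kakaopay.com/news/pr_detail"


def article_url(item: dict) -> str:
    """Build the detail-page URL for one item of the API's 'list'."""
    return f"{DETAIL_URL}?{urlencode({'id': item['id']})}"


# Sample items copied (and shortened) from the output shown above
sample = [
    {"id": 278, "present_dttm": "2023. 5. 17."},
    {"id": 277, "present_dttm": "2023. 5. 15."},
]

for item in sample:
    print(item["present_dttm"], article_url(item))
```

Whether the detail pages themselves are also rendered by JavaScript is not covered here, so fetching them with requests may again need a look at the browser's network tab.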