Using BeautifulSoup for finding scripts
Question
The code below was working a few days ago, but now it only finds the first script on the page:
import requests
from bs4 import BeautifulSoup

url = 'https://understat.com/team/{}/2022'.format('Brentford')
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
scripts = soup.find_all('script')
scripts
Answer 1
Score: 1
Try setting a cookie when requesting the page:
```py
import requests
from bs4 import BeautifulSoup

url = "https://understat.com/team/{}/2022".format("Brentford")
response = requests.get(url, cookies={"beget": "begetok"})  # <-- note the cookies= parameter
soup = BeautifulSoup(response.content, "html.parser")
scripts = soup.find_all("script")
print(scripts)
```
Prints:
```
...
window.onload = function() { (adsbygoogle = window.adsbygoogle || []).push({}); }
</script>, <script defer="" src="js/date.format.min.js?v=2" type="text/javascript"></script>, <script defer="" src="js/calendar.js?v=2.1" type="text/javascript"></script>, <script defer="" src="js/team.js?v=2.5" type="text/javascript"></script>]
```
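A plausible reason the cookie helps, though this is an assumption and not confirmed in the thread, is that the server first returns a short stub page whose only script sets a cookie through `document.cookie` and reloads. That would explain why only one script is found without it. The sketch below shows one way to read such a cookie out of the stub instead of hard-coding it. The helper names `cookies_set_by_script` and `fetch_with_script_cookies` are hypothetical, and the regular expression only handles a simple quoted `name=value` assignment.

```python
import re

# Matches assignments such as: document.cookie="name=value; path=/"
# Assumption: the stub page sets its cookie with a plain quoted string.
COOKIE_RE = re.compile(r"""document\.cookie\s*=\s*["']([^=;"']+)=([^;"']*)""")


def cookies_set_by_script(html):
    """Return {name: value} for cookies assigned via document.cookie in inline JS."""
    return {name: value for name, value in COOKIE_RE.findall(html)}


def fetch_with_script_cookies(url):
    """Fetch url; if the first response sets cookies in JS, retry with them."""
    import requests  # third-party; imported here so the parser above needs only the stdlib

    with requests.Session() as session:
        first = session.get(url, timeout=10)
        cookies = cookies_set_by_script(first.text)
        if not cookies:
            # No script-set cookie found, so treat this as the real page.
            return first
        session.cookies.update(cookies)
        return session.get(url, timeout=10)
```

If the assumption holds, `fetch_with_script_cookies(url).content` can be passed to `BeautifulSoup` exactly as in the answer above. If the helper finds no cookie, it returns the first response unchanged.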