英文:
Website not returning data that I want using beautifulsoup, but it shows up fine in my browser
问题
我尝试从这个网站抓取一些数据,但出现了403错误。当我在浏览器中打开它时,没有出现错误。帮助将不胜感激。这是我第一次尝试进行网络抓取。我认为我需要在标头中做一些不同的事情?不太确定。谢谢
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}
url = 'https://api.prizepicks.com/projections'
r = requests.get(url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)
我收到403错误,而且没有返回我想要的数据。
英文:
I'm trying to scrape some data from this website but getting a 403 error. When I open it in my browser its not giving me the error. Help would be appreciated. This is my first time trying any web scraping. I think I need something different in my header? not sure. thanks
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}
url = 'https://api.prizepicks.com/projections'
r = requests.get(url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)
I get a 403 error and its not returning the data I want.
答案1
得分: 0
以下是翻译好的代码部分:
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}
r = requests.get(pp_props_url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)
英文:
The following code should work:
import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}
r = requests.get(pp_props_url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论