Website not returning data that I want using beautifulsoup, but it shows up fine in my browser.

huangapple go评论63阅读模式
英文:

Website not returning data that I want using beautifulsoup, but it shows up fine in my browser

问题

我尝试从这个网站抓取一些数据,但出现了403错误。当我在浏览器中打开它时,没有出现错误。帮助将不胜感激。这是我第一次尝试进行网络抓取。我认为我需要在标头中做一些不同的事情?不太确定。谢谢

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}

url = 'https://api.prizepicks.com/projections'

r = requests.get(url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)

我收到403错误,而且没有返回我想要的数据。

英文:

I'm trying to scrape some data from this website but getting a 403 error. When I open it in my browser its not giving me the error. Help would be appreciated. This is my first time trying any web scraping. I think I need something different in my header? not sure. thanks

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}


url = 'https://api.prizepicks.com/projections'

r = requests.get(url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)

I get a 403 error and its not returning the data I want.

答案1

得分: 0

以下是翻译好的代码部分:

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}

r = requests.get(pp_props_url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)
英文:

The following code should work:

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd

pp_props_url = 'https://api.prizepicks.com/projections?league_id=7&per_page=250&single_stat=true'
headers = {
'Connection': 'keep-alive',
'Accept': 'application/json; charset=UTF-8',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36',
'Access-Control-Allow-Credentials': 'true',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Referer': 'https://app.prizepicks.com/',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'en-US,en;q=0.9'
}

r = requests.get(pp_props_url, headers=headers)
print(r)
df = pd.json_normalize(r.json()['data'])
print(df)

huangapple
  • 本文由 发表于 2023年2月13日 23:05:54
  • 转载请务必保留本文链接:https://go.coder-hub.com/75437660.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定