Reading xlsx into pandas from Google Drive without downloading it?

Question


I am trying to read an xlsx file into a pandas DataFrame without downloading it, but I am having issues.

I tried read_excel on a URL built from the sheet_id, but ran into the following error, which blocks me.

url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export"
file_df = pd.read_excel(url, engine='openpyxl')

Error:

zipfile.BadZipFile: File is not a zip file

I have a service account for authentication (which should be enough) and do not wish to use a bearer token as a solution, as this suggests.

Any help would be much appreciated.

Answer 1

Score: 1


Since your spreadsheet is hosted on Google, you should export it in the desired format by passing an explicit format parameter in the export URL.
Either csv:

df = pd.read_csv(f"https://docs.google.com/spreadsheets/export?id={sheet_id}&format=csv")

Or xlsx:

df = pd.read_excel(f"https://docs.google.com/spreadsheets/export?id={sheet_id}&format=xlsx")
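
If the error persists, it may help to fetch the export into memory first and check that the response really is a workbook before handing it to pandas. An .xlsx file is a zip archive, so BadZipFile usually means the server returned something else. The following is only a sketch: the helper names export_url, looks_like_xlsx and read_sheet are made up for illustration, and it assumes the spreadsheet is readable without authentication (for example, shared by link). A private sheet accessed through a service account would need authenticated requests, which this sketch does not cover.

```python
import io
import zipfile
from urllib.parse import urlencode
from urllib.request import urlopen


def export_url(sheet_id: str, fmt: str = "xlsx") -> str:
    """Build the export URL used in the answer for the given format."""
    query = urlencode({"id": sheet_id, "format": fmt})
    return f"https://docs.google.com/spreadsheets/export?{query}"


def looks_like_xlsx(payload: bytes) -> bool:
    """Return True if the payload is a zip archive, as every .xlsx file is."""
    return zipfile.is_zipfile(io.BytesIO(payload))


def read_sheet(sheet_id: str):
    """Fetch the xlsx export into memory and parse it without writing to disk."""
    import pandas as pd  # imported here so the helpers above work without pandas

    with urlopen(export_url(sheet_id)) as response:
        payload = response.read()
    if not looks_like_xlsx(payload):
        # Often an HTML sign-in or error page rather than a workbook.
        raise ValueError("Response is not an xlsx (zip) file; check sharing settings.")
    return pd.read_excel(io.BytesIO(payload), engine="openpyxl")
```

Calling read_sheet(sheet_id) then either returns a DataFrame or raises a ValueError that points at permissions instead of the less obvious BadZipFile.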

huangapple
  • Posted on June 15, 2023, 18:45:28
  • When reposting, please keep this link: https://go.coder-hub.com/76481721.html