Reading an xlsx file into pandas from Google Drive without downloading it?
Question
I am trying to read an xlsx file into a pandas DataFrame without downloading it, but I am running into problems.
I tried calling read_excel on a URL built from the sheet_id, but hit the following error, which blocks me.
url = f"https://docs.google.com/spreadsheets/d/{sheet_id}/export"
file_df = pd.read_excel(url, engine='openpyxl')
Error:
zipfile.BadZipFile: File is not a zip file
I have a service account for authentication (which should be enough) and do not wish to use a bearer token as a solution, as this suggests.
Any help would be much appreciated.
Answer 1
Score: 1
Since your spreadsheet is hosted on Google, you should export it in the desired format by naming that format in the export URL.
Either csv:
df = pd.read_csv(f"https://docs.google.com/spreadsheets/export?id={sheet_id}&format=csv")
Or xlsx:
df = pd.read_excel(f"https://docs.google.com/spreadsheets/export?id={sheet_id}&format=xlsx")
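If the BadZipFile error persists, it may help to check what the export URL actually returns before pandas parses it. The sketch below is only an illustration under stated assumptions: it assumes the sheet is shared so that the export URL can be fetched without credentials (it does not use the service account), and the helper names build_export_url, looks_like_xlsx and read_sheet are hypothetical, not part of pandas or any Google library. An .xlsx file is a zip archive, so a body that is not a zip (for example an HTML page) would explain the error.

```python
import io
import zipfile
from urllib.parse import urlencode


def build_export_url(sheet_id: str, fmt: str = "xlsx") -> str:
    """Build the export URL from the answer for the given sheet and format."""
    query = urlencode({"id": sheet_id, "format": fmt})
    return f"https://docs.google.com/spreadsheets/export?{query}"


def looks_like_xlsx(content: bytes) -> bool:
    """Return True if the bytes form a zip archive, as every .xlsx file does."""
    return zipfile.is_zipfile(io.BytesIO(content))


def read_sheet(sheet_id: str):
    """Fetch the export into memory and parse it, without writing to disk.

    Assumption: the sheet can be fetched without authentication.
    """
    import urllib.request

    import pandas as pd  # imported here so the helpers above stay stdlib-only

    with urllib.request.urlopen(build_export_url(sheet_id)) as response:
        content = response.read()
    if not looks_like_xlsx(content):
        # A non-zip body is what makes openpyxl raise BadZipFile.
        raise ValueError("Export did not return an xlsx file; check sharing settings.")
    return pd.read_excel(io.BytesIO(content), engine="openpyxl")
```

Calling read_sheet(sheet_id) would then return a DataFrame, or raise a clearer error than BadZipFile when the response is not a spreadsheet.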