Google Drive API via Service Account - Storage quota exceeded

Question
I have a Google service account (owned by me), which has been given access to another user's Google Drive account (owned by someone else - standard 15GB storage). They granted full access to a single folder under their G-Drive account, and I can see that the service account has full access to the folder when using the googleapiclient Python package.

I have a process that lists all the files within that folder and overwrites some of them based on new/incoming data from an external source. Here is what I see:

- Files that are very small (a few KB to a few MB) let me overwrite them over and over again with no issues. I do not see any version history, which I kind of expected to see.
- Files over 100MB seem to hit a quota error. It doesn't matter whether the file is already sitting on the server and I am overwriting it, or whether I remove the file completely and try to upload it fresh. Oddly enough, I don't hit this every time, and the upload mechanics themselves work fine from Python since I am using a multipart/resumable upload.
- When I get this quota error, I can still log into the Google UI and manually upload files to that drive just fine. No errors.
- When I log into the Google Drive account from the UI, I can see 0 bytes used under "Storage", even though I have 15 or so small-ish files in "My Drive".

So my questions are:

- Do service accounts have a transfer quota? I know for sure the Drive is not full, but I still get that quota limit with bigger files.
- Are there differences between "My Drive" and "Storage" that I need to account for, such as impersonating a given user?

Thanks!
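One thing worth checking before assuming a transfer quota: files uploaded by a service account are typically owned by the service account itself, not by the user who shared the folder, so they count against the service account's own storage quota rather than the human user's 15GB. That would be consistent with seeing `storageQuotaExceeded` while the user's "Storage" shows 0 bytes. A minimal sketch for inspecting the service account's own quota via the Drive v3 `about.get` endpoint is below; the key-file path is a hypothetical placeholder, and the Google client imports are deferred so the helper function can be used on its own.

```python
def quota_summary(storage_quota: dict) -> dict:
    """Normalize the Drive v3 `about.storageQuota` fields, which the API
    returns as strings of bytes, into integers (missing fields become 0)."""
    return {k: int(storage_quota.get(k, 0))
            for k in ("limit", "usage", "usageInDrive", "usageInDriveTrash")}


def print_service_account_quota(key_file: str) -> dict:
    """Fetch the storage quota of the service account itself (not the human
    user). `key_file` is a hypothetical path to the service account JSON key."""
    # Deferred imports so this sketch loads without the Google client libraries.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    creds = service_account.Credentials.from_service_account_file(
        key_file, scopes=["https://www.googleapis.com/auth/drive"])
    service = build("drive", "v3", credentials=creds)
    about = service.about().get(fields="storageQuota").execute()
    summary = quota_summary(about["storageQuota"])
    print(summary)
    return summary
```

If `usage` here is at or near `limit`, the quota being exhausted is the service account's, and the fix is to make the end user the owner of the files (e.g. via the OAuth user flow in the answer below) or to delete files the service account owns.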
Answer 1
Score: 1
This is what I have done, which seems to be working so far:

```python
import os
import pickle

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


def get_service_user(api_name, api_version, scopes):
    creds = None
    if os.path.exists("token.pickle"):
        with open("token.pickle", "rb") as token:
            creds = pickle.load(token)
    if not creds or not creds.valid:
        # Assuming here that once we have done the "grant" once as a real user,
        # this file can then be moved to where it's running, and update automatically.
        # The user will need to run the script locally, collect the token.pickle
        # file, and give it to the person automating this thing.
        # Or if this runs in a UI, the back-end process can just store it for a
        # given user.
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file("credentials.json", scopes)
            # Can't run a local server on Databricks. If the user changes, you must
            # run this locally, authenticate with Google, and store this file back
            # in DBFS (location above).
            # Uncomment the "creds" line below to run locally.
            # creds = flow.run_local_server(port=0)
            print("The token can't be refreshed. Please run this code locally and generate a new pickle file")
            exit(0)
        # Save the credentials for the next run
        with open("token.pickle", "wb") as token:
            pickle.dump(creds, token)
    # Return the Google Drive API service
    service = build(api_name, api_version, credentials=creds)
    return service
```
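With a service built this way (authorized as the real user, so uploads are owned by the user and count against their 15GB), the overwrite step can be sketched as below. The folder ID, file paths, and `upload_or_overwrite` helper are illustrative assumptions, not part of the original answer; the Drive calls themselves (`files().create`, `files().update`, `MediaFileUpload` with `resumable=True`) are standard Drive v3 usage.

```python
import math
import os


def chunk_count(file_size: int, chunk_size: int) -> int:
    """Number of requests a resumable upload will need at a given chunk size."""
    return max(1, math.ceil(file_size / chunk_size))


def upload_or_overwrite(service, local_path, folder_id, file_id=None):
    """Create the file under `folder_id`, or overwrite it in place when a
    `file_id` is given (files().update keeps the same file ID, so shared
    links keep working)."""
    from googleapiclient.http import MediaFileUpload  # deferred import

    media = MediaFileUpload(local_path, resumable=True,
                            chunksize=10 * 1024 * 1024)  # 10MB chunks
    if file_id:
        return service.files().update(fileId=file_id, media_body=media).execute()
    metadata = {"name": os.path.basename(local_path), "parents": [folder_id]}
    return service.files().create(body=metadata, media_body=media,
                                  fields="id").execute()


# Hypothetical usage, reusing get_service_user from above:
# service = get_service_user("drive", "v3", ["https://www.googleapis.com/auth/drive"])
# upload_or_overwrite(service, "big_file.csv", folder_id="FOLDER_ID")
```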
Comments