Querying Google Drive files by label is failing
Question
I need some help, as I'm smashing my head against a wall.
I need to write a script that runs periodically on AWS Lambda and pulls values from some sheets in Google Drive. The most straightforward way to find those sheets is the Drive labels feature. We've enabled it, created the label, and applied it to some files.
I can then use the API Explorer to query for all files with that label using this query:
'labels/LYBX-my-label-id-bFcb' in labels
I can also grab the request my browser sent and run it locally in Postman, Node, or whatever. It works and returns the expected file listing.
However, that uses my personal account credentials, and to do this "for real" we need a service account, of course. So we created a GCP project with a service account, and I'm using the googleapiclient Python package. I store the key for that service account in AWS Secrets Manager, fetch it, and configure my instance of the drive resource with it.
This all works. I can use it to call drive.files().get(...) and drive.files().list(...) and fetch data on files using all sorts of queries, except the label query above. When I run that query, I get back a 400 error that complains about the q (query) parameter.
I've now dropped down to the level of the URL itself: the exact GET request URL that my Python script logs works when I send it with my personal bearer token. I'm therefore pretty sure that this is not in fact a bad-parameter issue, and is instead just a case of Google being godawful at API design and returning crappy error codes.
So I'm thinking this has to be a permission issue, but I have no clue which permissions an account needs in order to search by Drive labels, nor how I would go about granting them to a service account.
Another possible clue: drive.files().listLabels(fileId="...") also seems to fail on a file that I know has labels. So again, everything points to some permission being missing, but it's unclear which one, or how to set it up on a service account.
Answer 1
Score: 1
Suggestion
> Note: Since I can't see your actual script, treat this answer as a starting point or reference for fixing the issue in your project. Hopefully it resolves your problem.
I reproduced the setup myself and was able to list files with a label-ID query using a service account, by way of user impersonation. Impersonation is configured when the credentials are created: pass a subject parameter so that the service account acts on behalf of a user (such as a super admin account, or any domain account with the necessary role) through service account delegation.
Test Script
from google.oauth2 import service_account
from googleapiclient.discovery import build

# Path to the service account JSON key file
KEY_FILE = 'sa.json'

# Scopes requested for the delegated credentials
SCOPES = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/drive.file',
    'https://www.googleapis.com/auth/drive.metadata',
    'https://www.googleapis.com/auth/drive.metadata.readonly',
    'https://www.googleapis.com/auth/drive.readonly',
]

# Create credentials from the key file; `subject` is the user to impersonate
credentials = service_account.Credentials.from_service_account_file(
    KEY_FILE,
    scopes=SCOPES,
    subject="irv@■■■■■■■■■■■■■■.■■■■",
)

# Build the Drive v3 service object
service = build('drive', 'v3', credentials=credentials)

# List files that carry the label
label_id = "OTVglmjg5BxgxSevMiuLtr6VoaeDwyg66AIRNNEbbFcb"
results = service.files().list(q=f"'labels/{label_id}' in labels").execute()
print(results)
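The test script reads the key from a local file, whereas the question keeps it in AWS Secrets Manager. A minimal sketch of the same impersonation setup for that case is below. The helper names (`load_service_account_info`, `build_delegated_drive`, `label_query`) and the secret name are hypothetical, and it assumes the secret stores the service account JSON key as a string. It keeps only the first scope from the test script; whether a narrower read-only scope is enough for label queries was not checked here. Impersonation also generally requires that domain-wide delegation has been authorized for the service account with matching scopes.

```python
import json

# First scope from the test script above; narrower scopes are untested here.
DRIVE_SCOPES = ["https://www.googleapis.com/auth/drive"]


def label_query(label_id):
    """Build the `q` expression that matches files carrying the given label."""
    return f"'labels/{label_id}' in labels"


def load_service_account_info(secrets_client, secret_id):
    """Fetch and parse the service account key.

    `secrets_client` is expected to be a boto3 "secretsmanager" client, and
    the secret is assumed to hold the JSON key file as a string.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


def build_delegated_drive(info, subject):
    """Build a Drive v3 client that acts on behalf of `subject`."""
    # Imported lazily so the pure helpers above work without these packages.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    credentials = service_account.Credentials.from_service_account_info(
        info, scopes=DRIVE_SCOPES, subject=subject
    )
    return build("drive", "v3", credentials=credentials)


# Possible usage inside the Lambda handler (names are placeholders):
#   import boto3
#   info = load_service_account_info(boto3.client("secretsmanager"), "drive-sa-key")
#   drive = build_delegated_drive(info, "user@your-domain.example")
#   drive.files().list(q=label_query("your-label-id")).execute()
```

Taking the Secrets Manager client as an argument keeps the parsing step easy to exercise without AWS access.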
Demo
> I created a test label and applied it to two files in my Drive:
> After running the test script:
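With only two labeled files a single call is enough, but files().list returns results one page at a time. For a larger set of labeled files, a sketch along these lines could follow nextPageToken until the listing is exhausted. The function name `iter_labeled_files`, the page size, and the `fields` selection are illustrative choices, not part of the original script.

```python
def iter_labeled_files(service, label_id, page_size=100):
    """Yield every file that carries the label, one page at a time.

    `service` is a Drive v3 client such as the one built in the test script.
    """
    page_token = None
    while True:
        params = {
            "q": f"'labels/{label_id}' in labels",
            "pageSize": page_size,
            "fields": "nextPageToken, files(id, name)",
        }
        if page_token:
            # Only send pageToken once the API has handed one back.
            params["pageToken"] = page_token
        response = service.files().list(**params).execute()
        yield from response.get("files", [])
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

Writing it as a generator lets the caller stop early, for example after finding the one sheet it needs, without fetching the remaining pages.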