Overwrite single file in a Google Cloud Storage bucket, via Python code
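Question
One way to overwrite an existing file in a Google Cloud Storage bucket from Python, while guarding against concurrent changes, is to set the if_generation_match parameter to the target object's current generation number. The overwrite then happens only if the generation still matches. If it does not, the operation fails, so you can catch the exception and handle it.
The code can be modified as follows to overwrite the file: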
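from google.api_core.exceptions import PreconditionFailed
from google.cloud import storage

bucket_name = "my-bucket"
destination_blob_name = "logs.txt"
source_file_name = "logs.txt"  # accessible from this script

storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)

# bucket.blob() makes no API call, so its generation would be None.
# get_blob() fetches the metadata and returns None if the object does not exist.
existing_blob = bucket.get_blob(destination_blob_name)
# Generation 0 means "succeed only if there is no live version of the object".
current_generation = existing_blob.generation if existing_blob is not None else 0

blob = bucket.blob(destination_blob_name)
try:
    # Upload the file; it is overwritten only if the generation still matches
    blob.upload_from_filename(source_file_name, if_generation_match=current_generation)
    print(f"File {source_file_name} uploaded and overwritten to {destination_blob_name}.")
except PreconditionFailed as e:
    # The object changed between get_blob() and the upload
    print(f"Error: {e}")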
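This code first reads the target object's current generation number, then tries to overwrite the object with upload_from_filename. If the generation no longer matches, the exception is caught and the error message is printed, so the overwrite happens only when the generation matches.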
I have a logs.txt file at a certain location in a Compute Engine VM instance. I want to periodically back up (i.e. overwrite) logs.txt in a Google Cloud Storage bucket. Since logs.txt is the result of some preprocessing done inside a Python script, I also want to use that script to upload / copy the file into the Google Cloud Storage bucket (therefore, using cp cannot be considered an option). Both the Compute Engine VM instance and the Cloud Storage bucket are in the same GCP project, so "they see each other". What I am attempting right now, based on this sample code, looks like:
from google.cloud import storage
bucket_name = "my-bucket"
destination_blob_name = "logs.txt"
source_file_name = "logs.txt" # accessible from this script
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)
generation_match_precondition = 0
blob.upload_from_filename(source_file_name, if_generation_match=generation_match_precondition)
print(f"File {source_file_name} uploaded to {destination_blob_name}.")
If gs://my-bucket/logs.txt does not exist, the script works correctly, but if I try to overwrite an existing object, I get the following error:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 2571, in upload_from_file
created_json = self._do_upload(
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 2372, in _do_upload
response = self._do_multipart_upload(
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 1907, in _do_multipart_upload
response = upload.transmit(
File "/usr/local/lib/python3.8/dist-packages/google/resumable_media/requests/upload.py", line 153, in transmit
return _request_helpers.wait_and_retry(
File "/usr/local/lib/python3.8/dist-packages/google/resumable_media/requests/_request_helpers.py", line 147, in wait_and_retry
response = func()
File "/usr/local/lib/python3.8/dist-packages/google/resumable_media/requests/upload.py", line 149, in retriable_request
self._process_response(result)
File "/usr/local/lib/python3.8/dist-packages/google/resumable_media/_upload.py", line 114, in _process_response
_helpers.require_status_code(response, (http.client.OK,), self._get_status_code)
File "/usr/local/lib/python3.8/dist-packages/google/resumable_media/_helpers.py", line 105, in require_status_code
raise common.InvalidResponse(
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 412, 'Expected one of', <HTTPStatus.OK: 200>)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/my_folder/upload_to_gcs.py", line 76, in <module>
blob.upload_from_filename(source_file_name, if_generation_match=generation_match_precondition)
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 2712, in upload_from_filename
self.upload_from_file(
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 2588, in upload_from_file
_raise_from_invalid_response(exc)
File "/usr/local/lib/python3.8/dist-packages/google/cloud/storage/blob.py", line 4455, in _raise_from_invalid_response
raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.PreconditionFailed: 412 POST https://storage.googleapis.com/upload/storage/v1/b/production-onementor-dt-data/o?uploadType=multipart&ifGenerationMatch=0: {
"error": {
"code": 412,
"message": "At least one of the pre-conditions you specified did not hold.",
"errors": [
{
"message": "At least one of the pre-conditions you specified did not hold.",
"domain": "global",
"reason": "conditionNotMet",
"locationType": "header",
"location": "If-Match"
}
]
}
}
: ('Request failed with status code', 412, 'Expected one of', <HTTPStatus.OK: 200>)
I have checked the documentation for upload_from_filename, but it seems there is no flag to "enable overwriting".
How do I properly overwrite an existing file in a Google Cloud Storage bucket using Python?
Answer 1
Score: 4
It's because of if_generation_match:
> As a special case, passing 0 as the value for if_generation_match
> makes the operation succeed only if there are no live versions of the
> blob.
This is what is meant by the return message "At least one of the pre-conditions you specified did not hold."
You should pass None or leave out that argument altogether.
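As a minimal sketch of that suggestion (not a definitive implementation), the upload could look roughly like the following. The helper names backup_file and backup_logs are hypothetical and only for illustration; the bucket and file names are the placeholders from the question, and the script is assumed to run with credentials that can write to the bucket.

```python
def backup_file(bucket, source_file_name, destination_blob_name):
    """Upload a local file, replacing any live object with the same name.

    `bucket` is expected to be a google.cloud.storage Bucket (or any object
    exposing a compatible blob() method).
    """
    blob = bucket.blob(destination_blob_name)
    # No if_generation_match argument is passed, so no precondition is sent
    # and the upload goes ahead whether or not the object already exists.
    blob.upload_from_filename(source_file_name)
    return blob


def backup_logs():
    # Imported lazily so backup_file() can be exercised without GCP access.
    from google.cloud import storage

    bucket = storage.Client().bucket("my-bucket")
    blob = backup_file(bucket, "logs.txt", "logs.txt")
    print(f"File logs.txt uploaded to {blob.name}.")
```

Calling backup_logs() from the preprocessing script after each run would then replace gs://my-bucket/logs.txt every time, without the 412 error.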