How to copy objects across GCP buckets while preserving metadata: gsutil cp drops custom metadata keys with null values
Question
I need to copy data between Google Cloud Storage (GCS) buckets; both the source and the destination are GCS buckets.
Because I perform the copy along with some other operations in small batches, I use the gsutil cp command from a bash shell script.
The exact command I use is as follows:
# objpaths_file has object paths such as gs://source_bucket/obj1, ...
objlist=objpaths_file
cat "$objlist" | gsutil -m cp -I gs://target_bucket
The objects to be copied have custom metadata fields.
Copying objects this way with gsutil cp does copy custom metadata keys and their values, provided each key has a non-null value.
If a custom metadata key has a null value, the copied object's metadata does not contain that key (the key with the null value is dropped during the copy).
So my questions are:
- Is there any other mechanism that will allow me to programmatically copy the objects with all custom metadata keys, regardless of whether a key's value is NULL?
- Is there an option to change this behavior of the gsutil cp command?
- Alternatively, I am also open to suggestions for programmatically recreating the missing metadata keys in the destination bucket and filling them with null values. Of course, this should only add the missing keys with null values and leave key-value pairs that have valid values intact.
And another, less relevant question:
- Is this gsutil behavior (skipping a custom metadata key if its value is NULL) expected, or is it a defect? If it is a defect, should I approach Google support for a fix?
Thanks for your response
Yogesh
Answer 1
Score: 0
Because the gsutil cp command loses this metadata, use the gcloud storage cp command instead.
The command below copies all metadata fields, including keys with NULL values:
gcloud storage cp gs://<source bucket>/objectname gs://<target bucket>/
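To check that claim on your own objects, one option is to compare the custom metadata keys of a source object with those of its copy. The following is only a minimal sketch: it assumes you have already written each object's keys to a text file, one key per line (how you extract them, for example from the output of gcloud storage objects describe, is not shown and is left to you). The function name missing_keys is hypothetical.

```shell
# Print the metadata keys that appear in the source list but not in the
# destination list. Both arguments are files holding one key per line.
# Producing those files is outside this sketch (an assumption).
missing_keys() {
  src_sorted=$(mktemp) && dst_sorted=$(mktemp) || return 1
  sort -u "$1" > "$src_sorted"
  sort -u "$2" > "$dst_sorted"
  # comm -23 keeps only the lines unique to the first (source) file.
  comm -23 "$src_sorted" "$dst_sorted"
  rm -f "$src_sorted" "$dst_sorted"
}

# Usage (file names are placeholders):
#   missing_keys source_keys.txt destination_keys.txt
```

An empty result means every key seen on the source object is also present on the copy.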
gcloud storage cp also accepts multiple source arguments, so it can be used like the gsutil batch mode. For example, I could use:
gcloud storage cp gs://<source bucket>/object1 gs://<source bucket>/object2 ... gs://<target bucket>/
(I tried this with 1000 objects.)
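The multi-argument form can be driven from the same objpaths_file used in the question. The sketch below is one possible way to do that in plain POSIX shell; it is not part of the original answer. The names copy_in_batches and COPY_CMD are hypothetical, and the default batch size of 1000 simply mirrors the number of objects tried above rather than any documented limit.

```shell
# Copy the objects listed in a file (one gs:// URL per line) in fixed-size
# batches, passing each batch as multiple source arguments.
# COPY_CMD is a hypothetical override hook for dry runs; when unset, the
# function runs "gcloud storage cp".
# Note: plain sh has no "local", so these variable names are global.
copy_in_batches() {
  objlist=$1
  target=$2
  batch_size=${3:-1000}
  set --                                 # reuse positional parameters as the batch
  while IFS= read -r path || [ -n "$path" ]; do
    [ -z "$path" ] && continue           # skip blank lines
    set -- "$@" "$path"
    if [ "$#" -ge "$batch_size" ]; then
      ${COPY_CMD:-gcloud storage cp} "$@" "$target" || return 1
      set --
    fi
  done < "$objlist"
  # Flush the final, possibly partial, batch.
  if [ "$#" -gt 0 ]; then
    ${COPY_CMD:-gcloud storage cp} "$@" "$target" || return 1
  fi
}

# Usage:
#   copy_in_batches objpaths_file gs://target_bucket/
# Dry run that only prints the commands:
#   COPY_CMD="echo gcloud storage cp" copy_in_batches objpaths_file gs://target_bucket/
```

Keeping the destination as the last argument of every invocation matches the command shape shown above, and the dry-run hook lets you inspect the batches before anything is copied.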