How to copy objects across GCP buckets preserving metadata: gsutil cp drops custom metadata keys with null values


Question

I need to copy data between Google Cloud Storage (GCS) buckets; both the source and the destination are GCS buckets.
Because I perform the copy along with some other operations in small batches, I use the gsutil cp command from a bash shell script.

The exact command I use is as follows:

# objpaths_file has object paths as gs://source_bucket/obj1, ...
objlist=objpaths_file
cat "$objlist" | gsutil -m cp -I gs://target_bucket

The objects to be copied have custom metadata fields.
Copying objects this way with "gsutil cp" does copy custom metadata key-value pairs, provided the metadata key has a non-null value.
If a custom metadata key has a null value, the copied object's metadata in the destination does not contain that key (the key with the null value is dropped during the copy).

So my questions are:

  • Is there any other mechanism that will allow me to programmatically copy the objects with all of their custom metadata keys (regardless of whether a key's value is NULL)?
  • Is there an option to change this behavior of the gsutil cp command?
  • Alternatively, I am also open to suggestions for programmatically recreating the missing metadata keys in the destination bucket and filling them with null values. Of course, this should only add the missing keys with null values and leave key-value pairs with valid values intact!

And another, less relevant question:

  • Is this gsutil behavior (skipping a custom metadata key if its value is NULL) expected, or does it amount to a defect? Should I approach Google support to seek a fix in that case?

Thanks for your response

Yogesh


Answer 1

Score: 0

Since the gsutil cp command loses metadata, use the gcloud storage cp command instead. The command below copies all metadata fields, including keys with NULL values:

gcloud storage cp gs://<source bucket>/objectname gs://<target bucket>/

gcloud storage cp also accepts multiple arguments, so it can be used like the gsutil batch mode. I was able to use:

gcloud storage cp gs://<source bucket>/object1 gs://<source bucket>/object2 ...<I tried 1000 objects> gs://<target bucket>/
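To tie this back to the file-driven workflow in the question, here is a minimal bash sketch that reads object paths from a list file and passes them to gcloud storage cp in fixed-size batches as multiple arguments. The function name copy_in_batches, the COPY_CMD override, and the default batch size of 500 are assumptions made for illustration, not part of gcloud. Whether null-valued metadata keys survive the copy rests on the report in this answer and is not checked by the script.

```shell
#!/usr/bin/env bash
# Sketch only: copy_in_batches and COPY_CMD are hypothetical names, not part of gcloud.
# Copies the object paths listed in a file to a destination, batch_size paths per call.
copy_in_batches() {
  local list_file=$1 dest=$2 batch_size=${3:-500}
  # COPY_CMD can be overridden (e.g. COPY_CMD=echo) to dry-run the batching.
  local -a cmd=(${COPY_CMD:-gcloud storage cp})
  local -a batch=()
  local path

  # Read the list on file descriptor 3 so the copy command cannot consume it via stdin.
  while IFS= read -r -u 3 path || [[ -n $path ]]; do
    [[ -z $path ]] && continue              # skip blank lines
    batch+=("$path")
    if (( ${#batch[@]} >= batch_size )); then
      "${cmd[@]}" "${batch[@]}" "$dest" || return 1
      batch=()
    fi
  done 3< "$list_file"

  # Copy whatever is left over after the last full batch.
  if (( ${#batch[@]} > 0 )); then
    "${cmd[@]}" "${batch[@]}" "$dest" || return 1
  fi
}
```

With the names from the question, this could be called as copy_in_batches objpaths_file gs://target_bucket 500, and running it first with COPY_CMD=echo prints the commands' arguments without copying anything. The gcloud storage cp reference also lists a --read-paths-from-stdin (-I) flag, which may make the batching unnecessary; the answer above does not cover it, so treat it as untested here.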


  • Posted by huangapple on June 27, 2023, 20:06:04
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76564690.html