How do I specify an S3 key with a wildcard when searching in Scala Spark?
Question
I have Scala Spark code that writes a JSON file whose name matches part-*.json, for example part-00000-14732361-f017-468a-b948-22d3b6d460dc-c000.json.
I want to call s3.doesObjectExist(bucket, key) where bucket = xyz and key = abc/def/part-*.json.
It looks like S3 doesn't support wildcard search. What is the best way to call s3.doesObjectExist(bucket, key) when I don't know the exact file name in S3? There is always exactly one such JSON file stored as part-*.json.
Please help, thanks!
Answer 1
Score: 0
It's not possible to do a wildcard match directly with the AWS API. You have to fetch the list of objects yourself and do the filtering on your side. If you have lots of objects, you can use S3 Inventory to get the list and filter that.
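For the "filter on your side" step, here is a minimal sketch in plain Scala. It assumes you already have the keys as a Seq[String] (from a list call or an inventory report). The names KeyGlob, globToRegex and filterKeys are made up for illustration, and only '*' is handled as a wildcard.

```scala
import scala.util.matching.Regex

object KeyGlob {
  // Convert a glob such as "abc/def/part-*.json" into an anchored regex.
  // Assumption of this sketch: only '*' is a wildcard, and it matches
  // any run of characters except '/'.
  def globToRegex(glob: String): Regex = {
    val body = glob.split("\\*", -1).map(Regex.quote).mkString("[^/]*")
    ("^" + body + "$").r
  }

  // Keep only the keys that match the glob in full.
  def filterKeys(keys: Seq[String], glob: String): Seq[String] = {
    val pattern = globToRegex(glob)
    keys.filter(k => pattern.pattern.matcher(k).matches())
  }
}
```

Calling KeyGlob.filterKeys(keys, "abc/def/part-*.json") and then checking nonEmpty on the result would stand in for the doesObjectExist call with a wildcard.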
Answer 2
Score: 0
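I did a workaround:
// S3 listing filters by key prefix, not by wildcard, so pass everything before the "*".
val bucket = "xyz"
val fileNamePrefix = "abc/def/part"
// Note: get(0) throws IndexOutOfBoundsException if nothing matches the prefix.
val key = s3.listObjectsV2(bucket, fileNamePrefix).getObjectSummaries.get(0).getKey
Since, as I mentioned, there is only one such file, the above code gives me the full key, including the complete file name, which I then use.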
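If the file might not be there yet, the get(0) in the workaround above will throw, so a variant that returns an Option may be safer. This is only a sketch: PartFileLookup, findPartFile and the listKeys hook are hypothetical names, and the listing is passed in as a function so the logic can be tried without an S3 client.

```scala
object PartFileLookup {
  // `listKeys` is a hypothetical hook: given (bucket, prefix) it returns the matching keys.
  // With the `s3` client from the workaround (AWS SDK for Java v1) it could be wired as:
  //   (b, p) => s3.listObjectsV2(b, p).getObjectSummaries.asScala.map(_.getKey).toSeq
  // which needs `import scala.jdk.CollectionConverters._` on Scala 2.13.
  def findPartFile(
      listKeys: (String, String) => Seq[String],
      bucket: String,
      prefix: String,
      suffix: String = ".json"
  ): Option[String] =
    // First key under the prefix with the expected suffix, or None if there is none.
    listKeys(bucket, prefix).find(_.endsWith(suffix))
}
```

findPartFile(...).isDefined then plays the role of doesObjectExist, and the Option carries the full key when it exists. Keep in mind that a single list call returns at most 1000 keys by default, so a very broad prefix would need pagination.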