How to get the CSV headers of an S3 Object without downloading entire file?
Question
I have a very large CSV file in S3, and just need to get the headers of that file (the top row of a CSV that has column names, not HTTP headers). Is there a way to do this without downloading the entire file first? I'm using the Java AWS SDK. I don't think this information is stored in the object metadata, but I may be wrong.
Edit:
The chosen answer below worked, and it used S3 Select, but the query that worked for me was:
select s.* from S3Object s limit 1
Answer 1
Score: 4
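You can use S3 Select to query the data in an object stored in S3 without downloading the whole object. It works on CSV, JSON, and Parquet objects.
The AWS docs have a Java example for this.
To select the column headers from a CSV file, you can limit the results to one record. This returns the header row only when the request treats the first line as data (FileHeaderInfo set to NONE in the CSV input settings); with USE or IGNORE the header line is skipped and you get the first data row instead. See the SELECT command reference for details.
For example:
QUERY = "select s.* from S3Object s limit 1";
The docs also have examples of other query types.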
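Once S3 Select returns that single record, it still has to be split into column names. Here is a minimal sketch in plain Java, assuming the record arrives as one comma-delimited line with optional double-quoted fields. The class CsvHeaderLine and its parse method are hypothetical names for illustration, not part of the AWS SDK.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvHeaderLine {

    /** Splits one CSV record into fields, honoring double-quoted fields and "" escapes. */
    static List<String> parse(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < record.length() && record.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else if (c != '\r' && c != '\n') {
                current.append(c);           // drop the record terminator
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical header record, as it might come back from the query above.
        for (String column : parse("id,\"last, first\",email\n")) {
            System.out.println(column);
        }
    }
}
```

If your file uses a different delimiter or quote character, the two character checks would need to change to match.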
Answer 2
Score: 2
I know that you can download a range of bytes from an object. So you can download just the start of the file (you'll have to figure out yourself how many bytes are enough to cover the header row) and then decode those bytes into a string.
The output will probably be the header plus some values, so you'll have to parse the content and keep only the text up to the first line break.
// Get a range of bytes from an object. The range is inclusive, so (0, 9) fetches
// only the first 10 bytes; widen it enough to cover the header row.
GetObjectRequest rangeObjectRequest = new GetObjectRequest(bucketName, key)
        .withRange(0, 9);
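To keep only the header from the ranged response, you can read the returned stream up to the first line break. This is a minimal sketch in plain Java, assuming UTF-8 text and a header row with no line breaks inside quoted fields. RangedHeaderReader and firstLine are hypothetical names. With the v1 SDK the stream would typically come from s3Client.getObject(rangeObjectRequest).getObjectContent(), where s3Client is an AmazonS3 client you have already built.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class RangedHeaderReader {

    /**
     * Reads bytes up to the first line feed and returns them decoded as UTF-8,
     * without the trailing CR/LF. Returns null if the stream ends before any
     * line feed is seen, which suggests the requested range was too small.
     */
    static String firstLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                String text = new String(line.toByteArray(), StandardCharsets.UTF_8);
                // Drop the carriage return left over from Windows-style line endings.
                return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
            }
            line.write(b);
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes a ranged GET might return (hypothetical data).
        byte[] prefix = "id,name,email\r\n1,Ann,ann@exa".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstLine(new ByteArrayInputStream(prefix)));
    }
}
```

If firstLine returns null, one option is to repeat the request with a larger range, for example doubling it each time, until a line break shows up. A file that consists of a single line with no trailing line break would also return null, so cap the retries at the object size.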