How to write a CSV file from S3 to DynamoDB line by line in Java?

Question

I have a CSV file in an S3 bucket whose first row is a header and whose remaining rows are values. For example, data.csv:

id | name | age | height
12 | abc  | 23  |  5.7
13 | xyz  | 25  |  5.3

I want to write the CSV rows to a DynamoDB table one by one. To do that, I want to store each row as a Java object (StatusObject): for each row, call StatusObject.setId(data[id]) and StatusObject.setName(data[name]), then write that object to DynamoDB.

I have a ddbMapper that writes the object to the DynamoDB table:

ddbMapper.save(statusObject);

This is how I retrieve the S3Object:

S3Object s3Object = s3Client.getObject(new GetObjectRequest(s3bucket, s3Key));
S3ObjectInputStream s3ObjectInputStream = s3Object.getObjectContent();

Can someone help me with the following conversion?

s3ObjectInputStream -> StatusObject (data.csv has extra columns that I want to skip, storing only the ones that match the StatusObject fields)

Thanks.

Answer 1

Score: 1

There is a similar example in the AWS Code Examples GitHub repo. However, instead of reading a CSV file, it reads an Excel spreadsheet located in an S3 bucket and places the data into an Amazon DynamoDB table.

This example uses the Amazon DynamoDB Enhanced Client Java API (part of the AWS SDK for Java V2). The ddbMapper belongs to the Java V1 SDK, which is no longer recommended. You should consider moving from ddbMapper to the Enhanced Client.

This example reads the data in the spreadsheet and places the data into a DynamoDB table.

This example should point you in the right direction for reading data from an S3 bucket and placing it into an Amazon DynamoDB table using the AWS SDK for Java.

https://github.com/awsdocs/aws-doc-sdk-examples/tree/main/javav2/usecases/Creating_etl_workflow
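
For the CSV case in the question, the reading side can be sketched with the JDK alone. This is a minimal sketch, not the linked example's code. It assumes comma-separated UTF-8 data with no quoted fields, and a StatusObject that has only id and name (the question does not show the class). CsvRowLoader, load, and the sink parameter are hypothetical names. Columns are looked up by header name, so the extra columns (age, height) are simply ignored.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical bean: the question names StatusObject but does not show its fields.
class StatusObject {
    private String id;
    private String name;
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class CsvRowLoader {
    // Reads the stream one line at a time, picks the wanted columns by header
    // name, and passes each StatusObject to the sink as soon as it is built.
    // Returns the number of rows handed to the sink.
    static int load(InputStream in, Consumer<StatusObject> sink) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String headerLine = reader.readLine();
            if (headerLine == null) {
                return 0; // empty file
            }
            Map<String, Integer> index = new HashMap<>();
            String[] headers = headerLine.split(",", -1);
            for (int i = 0; i < headers.length; i++) {
                index.put(headers[i].trim(), i);
            }
            int idCol = column(index, "id");
            int nameCol = column(index, "name");

            int count = 0;
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] data = line.split(",", -1);
                if (data.length <= Math.max(idCol, nameCol)) {
                    continue; // skip rows too short to hold the wanted columns
                }
                StatusObject row = new StatusObject();
                row.setId(data[idCol].trim());
                row.setName(data[nameCol].trim());
                sink.accept(row);
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static int column(Map<String, Integer> index, String name) {
        Integer position = index.get(name);
        if (position == null) {
            throw new IllegalArgumentException("Missing column: " + name);
        }
        return position;
    }

    public static void main(String[] args) {
        String csv = "id,name,age,height\n12,abc,23,5.7\n13,xyz,25,5.3\n";
        List<StatusObject> written = new ArrayList<>();
        // The list stands in for the database write in this demo.
        int rows = load(new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8)),
                        written::add);
        System.out.println(rows + " rows, first id = " + written.get(0).getId());
    }
}
```

With the objects from the question, the call would look like CsvRowLoader.load(s3Object.getObjectContent(), row -> ddbMapper.save(row)), since S3ObjectInputStream is an InputStream. With the Enhanced Client, the sink would call putItem on the table instead. Real CSV files with quoted or escaped fields need a proper CSV parser rather than split.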

huangapple
  • Posted on June 12, 2023, 04:17:54
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76452359.html