Flink Java API - Pojo Type to Tuple Datatype
Question
I am building a small utility with the Java Flink API to learn its features. I am trying to read a CSV file and simply print it, and I have written a POJO class for the structure of the data. When I run the code, I don't see the right values: Integer values are replaced with zeros and String fields are replaced with null (empty) values. How do I map the data types for the attributes?
My main class:
package org.karthick.flinkLab;

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import javax.xml.crypto.Data;

public class CSVFileRead {
    public static void main(String[] args) throws Exception {
        System.out.println("--CSV File Reader using Flink's Data Set API--");
        ExecutionEnvironment execEnv = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<DataModel> csvInput = execEnv.readCsvFile("C:\\Flink\\Data\\IndividualDetails.csv")
                .pojoType(DataModel.class);
        csvInput.print();
    }
}
My POJO class (DataModel.class):
package org.karthick.flinkLab;

import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple12;
import java.io.Serializable;
import java.util.Date;

public class DataModel<T extends Tuple>
        extends Tuple12<Integer,String,Date,Integer,String,String,String,String,String,String,Date,String>
        implements Serializable
{
    public Integer id;
    public String government_id;
    public Date diagnosed_date;
    public Integer age;
    public String detected_city;
    public String detected_district;
    public String detected_state;
    public String nationality;
    public String current_status;
    public Date status_change_date;
    public String notes;

    public DataModel() {};

    public String getNotes() {
        return notes;
    }
    public Date getStatus_change_date() {
        return status_change_date;
    }
    public String getCurrent_status() {
        return current_status;
    }
    public String getNationality() {
        return nationality;
    }
    public String getDetected_state() {
        return detected_state;
    }
    public String getDetected_district() {
        return detected_district;
    }
    public String getDetected_city() {
        return detected_city;
    }

    public String gender ;

    public Date getDiagnosed_date() {
        return diagnosed_date;
    }
    public String getGender() {
        return gender;
    }
    public Integer getAge() {
        return age;
    }
    public Integer getId() {
        return id;
    }
    public void setId(Integer id) {
        this.id = id;
    }
    public String getGovernment_id() {
        return government_id;
    }
    public void setGovernment_id(String government_id) {
        this.government_id = government_id;
    }
}
When I run the main method, I don't see the proper values. Sample result:
(0,,Tue May 19 16:50:38 IST 2020,0,,,,,,,Tue May 19 16:50:38 IST 2020,)
whereas I expect something like:
(2777,AP,Tue May 19 16:50:38 IST 2020,0,A,B,C,D,E,F,Tue May 19 16:50:38 IST 2020,G)
What could be missing here?
Answer 1
Score: 1
You are missing the column mapping from the CSV to the POJO; adding it should fix the problem. The mapping must follow two rules:
- The column names must exactly match the field names in the POJO.
- The order of the columns in the mapping must exactly match the column order in the CSV file.
You can define the mapping as follows:
DataSet<DataModel> csvInput = execEnv.readCsvFile("C:\\Flink\\Data\\IndividualDetails.csv")
        .pojoType(DataModel.class, "id", "age",.........);
Flink should have thrown an error here but did not, which could be a bug.
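To make the two rules concrete, here is a minimal, Flink-free sketch of what a name-and-order mapping amounts to. It is not Flink's implementation: the class ColumnMappingSketch, the trimmed-down Person POJO and the mapLine helper are all hypothetical names invented for this illustration, and only Integer and String fields are handled.

```java
import java.lang.reflect.Field;

public class ColumnMappingSketch {

    // Hypothetical, trimmed-down POJO (not the asker's DataModel).
    public static class Person {
        public Integer id;
        public String governmentId;
        public Integer age;

        public Person() {}
    }

    // Assigns the i-th CSV column to the public field named fieldNames[i].
    // Illustration only: this mimics the idea of a column mapping, not Flink's code.
    public static <T> T mapLine(String line, Class<T> type, String... fieldNames)
            throws ReflectiveOperationException {
        String[] columns = line.split(",", -1);
        if (columns.length != fieldNames.length) {
            throw new IllegalArgumentException(
                    "Expected " + fieldNames.length + " columns but found " + columns.length);
        }
        T target = type.getDeclaredConstructor().newInstance();
        for (int i = 0; i < fieldNames.length; i++) {
            // Rule 1: the name must match a public field, otherwise NoSuchFieldException.
            Field field = type.getField(fieldNames[i]);
            String raw = columns[i].trim();
            // Rule 2: position i in the mapping decides which column feeds the field.
            if (field.getType() == Integer.class) {
                field.set(target, raw.isEmpty() ? null : Integer.valueOf(raw));
            } else {
                field.set(target, raw);
            }
        }
        return target;
    }

    public static void main(String[] args) throws Exception {
        Person p = mapLine("2777,AP,34", Person.class, "id", "governmentId", "age");
        System.out.println(p.id + " " + p.governmentId + " " + p.age);
    }
}
```

In this sketch, listing the names in a different order than the CSV columns either fails to parse (a String column landing in an Integer field) or silently puts values in the wrong fields, which is why the order in the mapping has to follow the file.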