Flink SQL-Cli: Hudi is abstract
Question
I'm trying to reproduce the Flink example from the Hudi quick-start guide (https://hudi.apache.org/docs/flink-quick-start-guide), but when I try to insert the sample data an error appears. Can someone help me with this?
The steps I'm following on my AWS EMR cluster are:
export JVM_ARGS=-Djava.io.tmpdir=/mnt/tmp
sudo aws s3 cp MyBucketLocation/hudi-flink-bundle_2.11-0.10.0.jar /lib/flink/lib/hudi-flink-bundle_2.11-0.10.0.jar
# Start the Flink SQL client
/usr/lib/flink/bin/sql-client.sh
-- Create the table
CREATE TABLE t1(
uuid VARCHAR(20) PRIMARY KEY NOT ENFORCED,
name VARCHAR(10),
age INT,
ts TIMESTAMP(3),
`partition` VARCHAR(20)
)
PARTITIONED BY (`partition`)
WITH (
'connector' = 'hudi',
'path' = 's3://issue-lmdl-s3-ldz/msk/Flink/kafka/',
'table.type' = 'MERGE_ON_READ' -- creates a MERGE_ON_READ table; the default is COPY_ON_WRITE
);
-- Insert the sample rows from the documentation
INSERT INTO t1 VALUES
('id1','Danny',23,TIMESTAMP '1970-01-01 00:00:01','par1'),
('id2','Stephen',33,TIMESTAMP '1970-01-01 00:00:02','par1'),
('id3','Julian',53,TIMESTAMP '1970-01-01 00:00:03','par2'),
('id4','Fabian',31,TIMESTAMP '1970-01-01 00:00:04','par2'),
('id5','Sophia',18,TIMESTAMP '1970-01-01 00:00:05','par3'),
('id6','Emma',20,TIMESTAMP '1970-01-01 00:00:06','par3'),
('id7','Bob',44,TIMESTAMP '1970-01-01 00:00:07','par4'),
('id8','Han',56,TIMESTAMP '1970-01-01 00:00:08','par4');
I'm working with EMR 6.8.0, and the Flink SQL client already works with Kafka; I just want to write these records in Hudi format.
Answer 1
Score: 0
It was a version problem. I fixed it by upgrading the Hudi library to version 1.15 or higher.
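A note on reading that version number: Hudi's own releases are numbered 0.x, so "1.15" most likely refers to the Flink version the Hudi bundle is built for (EMR 6.8.0 appears to ship Flink 1.15.x, while the hudi-flink-bundle_2.11-0.10.0.jar in the question targets an older Flink). The sketch below is only an illustration under that assumption. The helper name hudi_bundle_for_flink is hypothetical, and the jar naming pattern hudi-flink<major.minor>-bundle-<hudi version>.jar is assumed from how recent Hudi releases appear to name their Flink bundles; check the actual artifact name before copying anything into /usr/lib/flink/lib.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink or Hudi): builds the jar name of the
# Hudi bundle that matches a given Flink version.
# ASSUMPTION: bundles are named hudi-flink<major.minor>-bundle-<hudi>.jar.
hudi_bundle_for_flink() {
  flink_version="$1"   # full Flink version, e.g. 1.15.1
  hudi_version="$2"    # Hudi release, e.g. 0.12.0

  # Keep only major.minor (1.15.1 -> 1.15); bundles are built per minor line.
  major_minor="$(printf '%s' "$flink_version" | cut -d. -f1,2)"

  printf 'hudi-flink%s-bundle-%s.jar\n' "$major_minor" "$hudi_version"
}

# Example: a Flink 1.15.1 cluster paired with a Hudi 0.12.0 bundle.
hudi_bundle_for_flink 1.15.1 0.12.0
```

The Flink version itself can be read on the cluster with /usr/lib/flink/bin/flink --version. The point of the sketch is that the bundle has to be chosen per Flink minor version rather than reused from an older quick-start guide.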