触发 Snowflake 任务在 snowpipe 完成运行后。

huangapple go评论52阅读模式
英文:

How to trigger Snowflake task after snowpipe completes its run

问题

我是新手对Snowflake不太了解,我正在尝试从S3加载数据到表A,并触发一个任务从表A加载数据到表B。
我已经成功实现了一个自动化的Snowpipe,可以在数据到达S3后立即将数据加载到表A中。它使用了一个监听S3存储桶中任何变化的SNS,然后将事件发送给Snowpipe,触发管道中的复制功能,将数据加载到表A中。

目标是在Snowpipe完成其运行并截断表A后触发一个任务,从表A加载数据到表B。我不确定如何解决这个问题,因为我担心在任务运行时可能会有一些数据流入S3,并且在截断表A时可能会删除额外的数据。

英文:

I am new to snowflake and I am trying to load data from S3 into Table A and trigger a Task to load data into table B from Table A.
I have successfully implemented an automated snow pipe that loads data into Table A from an S3 immediately after data arrives in S3. It uses as SNS that listens to any changes in the S3 bucket and then sends an event to Snowpipe which triggers the copy into function in the pipe and loads data into table A.

The objective is to trigger a Task that loads data into Table B from Table A when snowpipe completes its run and then truncates tables A. I'm unsure how to approach this as I am concerned some data might flow into the s3 while the Task is running and when its truncates table A it may delete additional data that came in.

答案1

得分: 0

只返回翻译好的部分:

根据数据摄入的频率,您可以在TableA上创建一个流,这个流执行CDC(更改数据捕获),因此记录表上发生的任何DML操作。

然后,您可以创建一个任务,仅在流中有数据时触发,任务内部可以调用一个存储过程,将TableA的数据合并到TableB。

英文:

Depending on the frequency of the data ingestion but you could create a stream over TableA, this stream does CDC (change data capture) so it records any dml that happens on the table.

Then you could create a task that is triggered only when the stream has data, inside that task you could call a stored procedure that merge the data from TableA to TableB

huangapple
  • 本文由 发表于 2023年3月7日 01:05:12
  • 转载请务必保留本文链接:https://go.coder-hub.com/75653754.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定