How can I directly tell a full refresh from an incremental update in a Delta Live Tables pipeline?

Question

I have a table that flows from Bronze to Silver to Gold.

I want to implement a function like 'is_full_refresh()' so that the pipeline filters the DataFrame (df) depending on its result: on a full refresh, don't filter; on an incremental update, filter by a, b, c.

I checked the Databricks documentation at https://docs.databricks.com/delta-live-tables/settings.html#cluster-config but couldn't find a direct way to differentiate between a full refresh and an incremental update.

How can I do that?

Answer 1

Score: 1

One option is to rely on the REST API call https://docs.databricks.com/api/workspace/pipelines/getupdate.

Another option I would try (it requires some investigation) is to query the DLT event log: https://docs.databricks.com/delta-live-tables/observability.html. I guess some log events may contain this information.
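To make the first option concrete, here is a minimal sketch of what an 'is_full_refresh()' helper built on the getupdate endpoint might look like. It rests on several assumptions that should be checked against the API reference linked above: that the response wraps the update details in an "update" object, that this object carries a boolean "full_refresh" field, and that the workspace host, access token, pipeline ID, and update ID are already available to the caller (how to obtain them is not covered here). The function names are illustrative and not part of any Databricks library.

```python
import json
import urllib.request


def is_full_refresh(update_response: dict) -> bool:
    """Return True if the update payload reports a full refresh.

    Assumption: the payload looks like {"update": {"full_refresh": true, ...}}.
    A missing field is treated as an incremental update.
    """
    update = update_response.get("update", {})
    return bool(update.get("full_refresh", False))


def fetch_update(host: str, token: str, pipeline_id: str, update_id: str) -> dict:
    """Call GET /api/2.0/pipelines/{pipeline_id}/updates/{update_id}.

    All four arguments are supplied by the caller; nothing here discovers
    them automatically.
    """
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates/{update_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

In the pipeline code, the result could then gate the filter, for example by returning df unchanged when is_full_refresh(fetch_update(...)) is true and applying the filter on a, b, c otherwise. Keeping the parsing separate from the HTTP call makes the decision logic easy to check without a workspace.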

Posted by huangapple on July 24, 2023, 18:37:31
Original link: https://go.coder-hub.com/76753648.html