How can I directly differentiate between a full refresh and an incremental update in a Delta Live Tables pipeline?
Question
I have a table that travels from bronze to silver to gold.
I want to implement a function like 'is_full_refresh()' so the pipeline filters the df depending on its output: if it's a full refresh, don't filter; if it's incremental, filter by a, b, c.
I checked the Databricks documentation at https://docs.databricks.com/delta-live-tables/settings.html#cluster-config but can't find a direct way to differentiate between a full refresh and an incremental one.
How can I do that?
Answer 1
Score: 1
One option would be to rely on the REST API call https://docs.databricks.com/api/workspace/pipelines/getupdate.
Another option I would try (it requires some investigation) is to query the DLT event log: https://docs.databricks.com/delta-live-tables/observability.html. I guess some log events may contain this information.
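For the REST API option, here is a minimal sketch of what the check might look like. It assumes the Get Update response wraps its fields in an "update" object with a boolean "full_refresh" field, which is how I read the API reference linked above. The helper names update_is_full_refresh and fetch_update, and the host, token, pipeline_id and update_id arguments, are hypothetical. How to obtain the current update ID from inside the running pipeline is not covered here and would need its own investigation.

```python
import json
import urllib.request


def update_is_full_refresh(update_response: dict) -> bool:
    """Return True if a Get Update response describes a full refresh.

    Assumption: the response looks like {"update": {"full_refresh": true, ...}}.
    A missing field is treated as "not a full refresh".
    """
    update = update_response.get("update", {})
    return bool(update.get("full_refresh", False))


def fetch_update(host: str, token: str, pipeline_id: str, update_id: str) -> dict:
    """Call GET /api/2.0/pipelines/{pipeline_id}/updates/{update_id}.

    host is the workspace URL and token a personal access token;
    both are supplied by the caller (hypothetical parameters).
    """
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates/{update_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A caller could then apply the a, b, c filter only when update_is_full_refresh(fetch_update(...)) returns False. Note that a full refresh of selected tables only may be reported in a different field of the response, so it is worth inspecting the full payload before relying on this check.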