Triggering a Databricks Delta Live Tables pipeline from Azure Data Factory resets all the tables. How do I disable that?
Question
I have created a pipeline in Azure Data Factory (ADF) that triggers a Delta Live Tables (DLT) pipeline in Azure Databricks through a Web activity, as described in the Microsoft documentation.
My problem is that when I trigger the DLT pipeline from ADF, it resets all the tables, so my data is unavailable while the pipeline runs. More specifically, the run includes the additional step shown in the screenshot below:
However, when I run it directly from the Databricks UI, the tables are not reset and the data remains available while the pipeline runs. Here is what it looks like:
I would like the same behavior in ADF as when I trigger the pipeline directly from the Databricks UI. I don't want this additional "resetting tables" step in my DLT pipeline when I trigger it from ADF.
Does anyone have a solution for this?
Answer 1
Score: 1
It looks like you have {"full_refresh": "true"} added to your Web activity parameters. With that setting, every run does a full refresh. To avoid it, just pass an empty object ({}) instead.
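For illustration, here is a minimal Python sketch of the request the Web activity is effectively making. It assumes the Web activity posts to the Databricks pipelines "start update" endpoint (/api/2.0/pipelines/{pipeline_id}/updates); the workspace URL, pipeline ID, and token below are hypothetical placeholders, and build_update_request is a helper name invented for this example. The request is only built, not sent, so the body can be inspected.

```python
import json
import urllib.request


def build_update_request(workspace_url, pipeline_id, token, full_refresh=False):
    """Build (but do not send) the POST request that starts a pipeline update.

    Assumption: the endpoint path below matches what the ADF Web activity calls.
    """
    url = f"{workspace_url.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    # An empty body requests a normal update; the flag is only included
    # when a full refresh (which resets the tables) is explicitly wanted.
    body = {"full_refresh": True} if full_refresh else {}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values, for illustration only.
req = build_update_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "my-pipeline-id",
    "dapiXXXX",
)
print(req.full_url)
print(req.data.decode("utf-8"))
```

In the ADF Web activity itself, the equivalent change is simply setting the Body field to {} rather than {"full_refresh": "true"}.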