Does an ADF data flow read the source multiple times?

Question

If I read from a data flow source, say Cosmos DB, and branch the source data into two streams based on filter conditions, does the data flow read the source twice to apply the two filters? Since data flows run on Spark, I believe the read has to happen twice unless the source data is kept on the Spark cluster. Does an ADF data flow also save the results of intermediate transformations?

Answer 1

Score: 1

If you add the same source twice, using two source transformations, then the data flow reads the source data twice.

If you use a conditional split or a filter transformation to split the data, it won't read the source twice. It treats it as a single source.
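
To make the difference concrete, here is a minimal toy model in plain Python. It is not ADF or Spark code and says nothing about how either engine is implemented; `CountingSource`, `two_source_transformations`, and `single_source_with_split` are hypothetical names invented for this sketch. It only counts how often a stand-in source gets read in the two layouts described above.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


class CountingSource:
    """Hypothetical stand-in for a source such as a Cosmos DB container."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows
        self.read_count = 0  # how many times the source was read in full

    def read(self) -> List[Row]:
        self.read_count += 1
        return list(self._rows)


def two_source_transformations(
    source: CountingSource, pred_a: Predicate, pred_b: Predicate
) -> Tuple[List[Row], List[Row]]:
    # Each branch starts from its own source step, so each branch reads.
    branch_a = [r for r in source.read() if pred_a(r)]
    branch_b = [r for r in source.read() if pred_b(r)]
    return branch_a, branch_b


def single_source_with_split(
    source: CountingSource, pred_a: Predicate, pred_b: Predicate
) -> Tuple[List[Row], List[Row]]:
    # One read feeds both branches, like a split placed after one source.
    rows = source.read()
    branch_a = [r for r in rows if pred_a(r)]
    branch_b = [r for r in rows if pred_b(r)]
    return branch_a, branch_b


rows = [{"id": i, "amount": i * 10} for i in range(6)]


def is_small(row: Row) -> bool:
    return row["amount"] < 30


def is_large(row: Row) -> bool:
    return row["amount"] >= 30


twice = CountingSource(rows)
two_source_transformations(twice, is_small, is_large)
print("two source steps, reads:", twice.read_count)

once = CountingSource(rows)
small, large = single_source_with_split(once, is_small, is_large)
print("one source + split, reads:", once.read_count)
```

In this sketch the first layout reads the stand-in source twice and the second reads it once, which mirrors the behaviour the answer describes for two source transformations versus a conditional split.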

huangapple
  • Posted on July 7, 2023 at 01:42:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76631332.html