'DataFrame' object has no attribute 'merge'
Question
I am new to PySpark and I am trying to merge a DataFrame into the one stored at a Delta location using the merge function.
DEV_Delta.alias("t").merge(df_from_pbl.alias("s"), condition_dev) \
    .whenMatchedUpdateAll() \
    .whenNotMatchedInsertAll() \
    .execute()
Both DataFrames have the same number of columns, but when I run this command in my notebook I get the following error:
'DataFrame' object has no attribute 'merge'
I couldn't find a solution for this particular problem, so I am raising a new question. Could you please help me figure out this issue?
Thanks,
Afras Khan
Answer 1
Score: 0
You need an instance of the DeltaTable class, but DEV_Delta is a DataFrame, and a DataFrame has no merge method. Create the instance with DeltaTable.forPath (pointing to a specific path) or DeltaTable.forName (for a named table), like this:
from delta.tables import DeltaTable  # DeltaTable lives in delta.tables

# Load the existing Delta table as a DeltaTable, not as a DataFrame
DEV_Delta = DeltaTable.forPath(spark, 'some path')
DEV_Delta.alias("t").merge(df_from_pbl.alias("s"), condition_dev) \
    .whenMatchedUpdateAll() \
    .whenNotMatchedInsertAll() \
    .execute()
If your data exists only as a DataFrame, you need to write it out as a Delta table first.
See the documentation for more details.
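The "write it first, then merge" flow could be sketched roughly as below. This is only an illustration under assumptions, not the asker's actual setup: the helper names build_merge_condition and upsert_to_delta are hypothetical, the key column list is made up, and the sketch assumes the delta-spark package is available and that condition_dev is a simple equality on key columns.

```python
def build_merge_condition(keys, target="t", source="s"):
    """Build an equality join condition such as 't.id = s.id AND t.day = s.day'.

    Hypothetical helper: `keys` is the list of column names that identify a row.
    """
    return " AND ".join(f"{target}.{k} = {source}.{k}" for k in keys)


def upsert_to_delta(spark, source_df, path, keys):
    """Merge source_df into the Delta table at `path`, creating it if missing.

    Hypothetical helper; assumes the delta-spark package is installed.
    """
    # Imported lazily so this module can be loaded without delta-spark present.
    from delta.tables import DeltaTable

    if not DeltaTable.isDeltaTable(spark, path):
        # No Delta table at this path yet: write the DataFrame out first.
        source_df.write.format("delta").save(path)
        return

    # Load the target as a DeltaTable (a plain DataFrame has no merge method).
    target = DeltaTable.forPath(spark, path)
    (
        target.alias("t")
        .merge(source_df.alias("s"), build_merge_condition(keys))
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )
```

With the names from the question, a call might look like upsert_to_delta(spark, df_from_pbl, 'some path', ["id"]), where "id" stands in for whatever columns the real condition_dev compares.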