In Databricks, does array_union maintain order?
Question
MERGE INTO final_table a
USING
(
  -- one row per student, collecting the distinct subjects from the change set
  select student_id, array_agg(distinct subject) as new_subject
  from changes_table
  group by student_id
) b
on a.student_id = b.student_id
WHEN MATCHED THEN
  -- both operands are already arrays, so they are not wrapped in array(...)
  UPDATE SET a.subject = array_union(b.new_subject, a.subject)
WHEN NOT MATCHED
  THEN INSERT (student_id, subject) VALUES (b.student_id, b.new_subject)
changes_table values = ["sql","python"]
final_table values = ["sql","scala"]
Result values = ["sql","python","scala"]
The output that I am getting is correct.
My question is: does array_union preserve the order of the elements in Databricks?
Answer 1
Score: 0
array_union is a Spark function, and its implementation can be read in the Spark source on GitHub.
If you examine it, you will see that it iterates through both arrays in sequence and adds each entry it has not seen before to an array buffer, which becomes the final result. So yes, it does preserve order: elements appear in order of first occurrence, with the first argument's elements ahead of any new ones from the second.
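The behaviour described above can be modelled in a few lines of plain Python. This is only a sketch of the logic as the answer describes it, not Spark's actual code: the function name array_union_sketch is made up for illustration, and it assumes hashable, non-null elements, so it says nothing about how Spark treats NULL or NaN. It also covers only the union step; the order of elements inside each input array (for example, one built by array_agg) is a separate question.

```python
def array_union_sketch(left, right):
    """Hypothetical model of an order-preserving union of two arrays."""
    seen = set()    # values already emitted
    result = []     # plays the role of the array buffer
    # Walk the first array, then the second, in their existing order.
    for value in list(left) + list(right):
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


# Values from the question: changes first, then the existing row.
print(array_union_sketch(["sql", "python"], ["sql", "scala"]))
# prints ['sql', 'python', 'scala']
```

Under this model, swapping the arguments changes the result order, so which array is passed first to array_union matters if a particular order is expected.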