How to select nested fields while preserving the nested structure in a Spark DataFrame
Question
My data looks like this:
key
  my_id1
  my_id2
value
  my_value1
  my_value2
I am only interested in my_value2, my_id2, and my_id1, but I want to preserve the nested structure.
If I do:
val myDf = myoriginalDF
  .select("key.*", "value.my_value2")
it returns flat columns:
my_id1
my_id2
my_value2
How do I preserve the nested structure during the select?
Answer 1
Score: 1
You can use select with struct to rebuild each nested column from only the fields you need:
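import org.apache.spark.sql.functions.struct
import spark.implicits._  // enables the $"..." syntax; assumes a SparkSession named spark

df.select(
  struct($"key.my_id1", $"key.my_id2").alias("key"),  // rebuild key with both ids
  struct($"value.my_value2").alias("value")           // rebuild value with my_value2 only
)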
The schema of your output will be:
root
 |-- key: struct
 |    |-- my_id1
 |    |-- my_id2
 |-- value: struct
 |    |-- my_value2
This should be what you want. Good luck!
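For anyone who wants to try this end to end, here is a minimal sketch of the same approach as a standalone program. The case classes Key, Value, and Record, the object name PreserveNesting, the helper selectNested, the field types, and the sample row are all assumptions made up for illustration; only the column names come from the question. It uses col(...) instead of $"..." so the helper does not depend on spark.implicits._.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}

// Hypothetical case classes mirroring the layout described in the question.
// The field types (Long, String) are assumptions; the question does not state them.
case class Key(my_id1: Long, my_id2: Long)
case class Value(my_value1: String, my_value2: String)
case class Record(key: Key, value: Value)

object PreserveNesting {
  // Rebuild each struct from only the wanted fields and alias it
  // back to the original top-level column name.
  def selectNested(df: DataFrame): DataFrame =
    df.select(
      struct(col("key.my_id1"), col("key.my_id2")).alias("key"),
      struct(col("value.my_value2")).alias("value")
    )

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("preserve-nesting")
      .getOrCreate()
    import spark.implicits._

    // One made-up row with the nested key/value layout.
    val myoriginalDF = Seq(Record(Key(1L, 2L), Value("a", "b"))).toDF()

    val myDf = selectNested(myoriginalDF)
    myDf.printSchema() // key and value remain structs; my_value1 is gone

    spark.stop()
  }
}
```

Each struct keeps the original field names because a column extracted with col("key.my_id1") is named after the field itself, so no extra aliasing is needed inside struct.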