PySpark pivot with dynamic columns
Question
I have a PySpark DataFrame as follows.
I am pivoting the data based on the Month and T columns and need to produce the following output.
Some quarters, such as q2, q3 and q4, are not present in the T column, but I need columns for them filled with null values. The list should be dynamic, and the quarters should be ordered latest first: q4 2023, then q3 2023, q2 2023, q1 2023, and so on.
I am using the following PySpark code:
FinalDF = df.groupBy("id", "month").pivot("T").agg(
    F.first("Oil"),
    F.first("Gas"),
)
Answer 1
Score: 0
You can use the optional `values` parameter of `pivot()` to pass the desired list of quarters. Any listed quarter that has no rows in the data becomes a column of nulls.
Here's an example:
```py
from pyspark.sql import functions as func

# data_sdf is assumed to be the input DataFrame with columns id, month, t, oil, gas
data_sdf. \
    groupBy('id', 'month'). \
    pivot('t', values=['q1 2023', 'q2 2023', 'q3 2023', 'q4 2023']). \
    agg(func.first('oil').alias('oil'),
        func.first('gas').alias('gas')
        ). \
    show()

# +---+-----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
# | id|month|q1 2023_oil|q1 2023_gas|q2 2023_oil|q2 2023_gas|q3 2023_oil|q3 2023_gas|q4 2023_oil|q4 2023_gas|
# +---+-----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
# |  1|    1|          5|         10|       null|       null|       null|       null|       null|       null|
# |  2|    2|         20|         30|       null|       null|       null|       null|       null|       null|
# +---+-----+-----------+-----------+-----------+-----------+-----------+-----------+-----------+-----------+
```
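The question also asks for the quarter list to be dynamic and ordered latest first, whereas the list in the example is hard-coded in ascending order. A minimal sketch of one way to build it, assuming the labels always look like `q1 2023` and that "dynamic" means "all four quarters of every year seen in the data". The helper name `full_quarters_desc` and the sample `observed` list are hypothetical, not part of the original answer.

```python
def full_quarters_desc(observed):
    """Return every quarter label for the years seen in `observed`, latest first.

    `observed` holds labels such as 'q1 2023' (assumed format of the T column).
    """
    # Collect the distinct years that appear in the observed labels.
    years = {int(label.split()[1]) for label in observed}
    # Emit q4..q1 for each year, newest year first.
    return [
        f"q{q} {year}"
        for year in sorted(years, reverse=True)
        for q in (4, 3, 2, 1)
    ]


# Hypothetical input. With a real DataFrame it could be collected with, e.g.:
# observed = [row["T"] for row in df.select("T").distinct().collect()]
observed = ["q1 2023"]
quarters = full_quarters_desc(observed)
print(quarters)  # ['q4 2023', 'q3 2023', 'q2 2023', 'q1 2023']
```

The resulting list can then be passed as `pivot('t', values=quarters)`. The pivoted columns follow the order of `values`, so the latest quarter comes first and quarters missing from the data are filled with null.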