Transform row categories into columns while preserving the rest of the data frame (Python)
Question
I have a data frame like the one below:
Time Groups Entity GC Seg Category Year Quarter IndicatorName Value
0 2021-06-01 KRO CO P_GA None Model_Q2_2021 2021 2 yhat 568759.481223
1 2021-07-01 KRO CO P_GA None Model_Q2_2021 2021 3 yhat 586003.965652
2 2021-08-01 KRO CO P_GA None Model_Q2_2021 2021 3 yhat 583703.420655
3 2021-09-01 KRO CO P_GA None Model_Q2_2021 2021 3 y 608601.857510
4 2021-10-01 KRO CO P_GA None Model_Q2_2021 2021 4 y 628928.602344
I want to turn the IndicatorName categories into columns, each holding the corresponding Value, while keeping the rest of the columns.
I tried pivot and melt, but neither gave me the desired result.
The closest I got was with this:
grouper = df.groupby('IndicatorName')
out = pd.concat([pd.Series(v['Value'].tolist(), name=k) for k, v in grouper], axis=1)
y yhat
0 8626.88 5.687595e+05
1 8215.30 5.860040e+05
2 8601.53 5.837034e+05
3 8145.16 6.086019e+05
4 9376.81 6.289286e+05
... ... ...
744 NaN 5.402358e+06
745 NaN 5.796123e+06
746 NaN 5.218829e+06
747 NaN 5.451504e+06
But I want to keep all the columns and add yhat and y as additional columns.
Any help/suggestion would be much appreciated.
Thanks in advance!
Answer 1
Score: 2
Are you looking for pivot_table?
>>> (df.pivot_table(index=df.columns[:-2].tolist(), columns=['IndicatorName'], values='Value')
.reset_index().rename_axis(columns=None))
Time Groups Entity GC Seg Category Year Quarter y yhat
0 2021-06-01 KRO CO P_GA None Model_Q2_2021 2021 2 NaN 568759.481223
1 2021-07-01 KRO CO P_GA None Model_Q2_2021 2021 3 NaN 586003.965652
2 2021-08-01 KRO CO P_GA None Model_Q2_2021 2021 3 NaN 583703.420655
3 2021-09-01 KRO CO P_GA None Model_Q2_2021 2021 3 608601.857510 NaN
4 2021-10-01 KRO CO P_GA None Model_Q2_2021 2021 4 628928.602344 NaN
You can replace [:-2] with [:-1] if you want to keep the IndicatorName column.
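For anyone who wants to try this locally, here is a minimal, self-contained sketch of the same pivot_table call. The frame is rebuilt from the five sample rows in the question. Storing Seg as the string "None" rather than a real missing value is an assumption, as is the helper name id_cols.

```python
import pandas as pd

# Sample rows copied from the question.
# Assumption: Seg holds the string "None", not a missing value.
df = pd.DataFrame({
    "Time": ["2021-06-01", "2021-07-01", "2021-08-01", "2021-09-01", "2021-10-01"],
    "Groups": ["KRO"] * 5,
    "Entity": ["CO"] * 5,
    "GC": ["P_GA"] * 5,
    "Seg": ["None"] * 5,
    "Category": ["Model_Q2_2021"] * 5,
    "Year": [2021] * 5,
    "Quarter": [2, 3, 3, 3, 4],
    "IndicatorName": ["yhat", "yhat", "yhat", "y", "y"],
    "Value": [568759.481223, 586003.965652, 583703.420655,
              608601.857510, 628928.602344],
})

# Every column except the last two (IndicatorName, Value) identifies a row.
id_cols = df.columns[:-2].tolist()

wide = (
    df.pivot_table(index=id_cols, columns="IndicatorName", values="Value")
      .reset_index()               # turn the identifying columns back into regular columns
      .rename_axis(columns=None)   # drop the leftover "IndicatorName" label on the column axis
)
print(wide)
```

Note that pivot_table aggregates with the mean by default, so if two rows share the same identifying columns and the same IndicatorName, their values are averaged rather than kept separately.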
Answer 2
Score: 0
Add the two columns you want and fill them with the value of df['Value'], conditioned on the value of df['IndicatorName']:
df['yhat'] = df['Value']*(df['IndicatorName'] == 'yhat')
df['y'] = df['Value']*(df['IndicatorName'] == 'y')
This should give you:
Time (...) IndicatorName Value yhat y
0 2021-06-01 (...) yhat 568759.48 568759.48 0
1 2021-07-01 (...) yhat 586003.96 586003.96 0
2 2021-08-01 (...) yhat 583703.42 583703.42 0
3 2021-09-01 (...) y 608601.85 0 608601.85
4 2021-10-01 (...) y 628928.60 0 628928.60
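Multiplying by the boolean mask writes 0 into the rows where the indicator does not match. If a missing value is preferred there (as in the pivot_table output), one possible variant, which is not part of the answer above, is Series.where. The sketch below uses a reduced three-column version of the sample data.

```python
import pandas as pd

# Reduced version of the sample data: only the columns needed for the example.
df = pd.DataFrame({
    "Time": ["2021-06-01", "2021-07-01", "2021-08-01", "2021-09-01", "2021-10-01"],
    "IndicatorName": ["yhat", "yhat", "yhat", "y", "y"],
    "Value": [568759.481223, 586003.965652, 583703.420655,
              608601.857510, 628928.602344],
})

for name in ["yhat", "y"]:
    # Keep Value where the indicator matches; all other rows become NaN.
    df[name] = df["Value"].where(df["IndicatorName"] == name)

print(df)
```

This keeps every original column, including IndicatorName and Value, and leaves the new columns empty rather than zero where they do not apply.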