英文:
How to unstack the table based on specific columns in python
问题
我有一个名为df1的表格
ID M1 M2 NO DTime Result(P2-P3) (P1-P2) (P1-P7) (P3-P7)
2801596 288 371 536529 08-02-2023 11:07 1 NaN 0.085 NaN NaN
2801596 289 371 536529 08-02-2023 11:07 1 1 0.032 1.081 NaN
2801584 290 372 541278 08-02-2023 11:10 1 NaN 0.081 NaN NaN
2801584 291 372 541278 08-02-2023 11:10 0 1 0.037 1.065 NaN
2801598 288 371 541279 08-02-2023 11:12 1 NaN 0.076 NaN NaN
2801599 288 371 555623 08-02-2023 11:14 1 1 NaN NaN 3.871
2801599 289 371 555623 08-02-2023 11:14 1 1 NaN NaN 2.389
2801600 291 372 555624 08-02-2023 11:18 1 NaN 0.0835 NaN NaN
我尝试使用透视表,但它给出了一个充满NaN的表格。
df2 = pd.pivot_table(df1, values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'], index=['ID','No','DTime'], columns=['M2','M1'], aggfunc='first')
我想要这个表格作为输出。
1: https://i.stack.imgur.com/kM9UU.png
英文:
I have a table like this name df1
ID M1 M2 NO DTime Result(P2-P3) (P1-P2) (P1-P7) (P3-P7)
2801596 288 371 536529 08-02-2023 11:07 1 NaN 0.085 NaN NaN
2801596 289 371 536529 08-02-2023 11:07 1 1 0.032 1.081 NaN
2801584 290 372 541278 08-02-2023 11:10 1 NaN 0.081 NaN NaN
2801584 291 372 541278 08-02-2023 11:10 0 1 0.037 1.065 NaN
2801598 288 371 541279 08-02-2023 11:12 1 NaN 0.076 NaN NaN
2801599 288 371 555623 08-02-2023 11:14 1 1 NaN NaN 3.871
2801599 289 371 555623 08-02-2023 11:14 1 1 NaN NaN 2.389
2801600 291 372 555624 08-02-2023 11:18 1 NaN 0.0835 NaN NaN
I have tried using Pivot table but it is giving a table full of Nan.
df2 = pd.pivot_table(df1, values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'], index=['ID','No','DTime'],columns=['M2','M1'], aggfunc='first')
答案1
得分: 2
将DataFrame.reorder_levels
与DataFrame.sort_index
一起使用:
df2 = df1.pivot_table(values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'],
index=['ID','NO','DTime'],
columns=['M2','M1'],
aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)
如果需要处理输入DataFrame中未在index
和columns
参数中设置的所有列,可以省略values
:
df2 = df1.pivot_table(index=['ID','NO','DTime'],
columns=['M2','M1'],
aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)
输入数据:
df1 = pd.DataFrame({'ID': {0: 2801596, 1: 2801596, 2: 2801584, 3: 2801584, 4: 2801598, 5: 2801599, 6: 2801599, 7: 2801600},
'M1': {0: 288, 1: 289, 2: 290, 3: 291, 4: 288, 5: 288, 6: 289, 7: 291},
'M2': {0: 371, 1: 371, 2: 372, 3: 372, 4: 371, 5: 371, 6: 371, 7: 372},
'NO': {0: 536529, 1: 536529, 2: 541278, 3: 541278, 4: 541279, 5: 555623, 6: 555623, 7: 555624},
'DTime': {0: '08-02-2023 11:07', 1: '08-02-2023 11:07', 2: '08-02-2023 11:10', 3: '08-02-2023 11:10',
4: '08-02-2023 11:12', 5: '08-02-2023 11:14', 6: '08-02-2023 11:14', 7: '08-02-2023 11:18'},
'Result': {0: 1, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1},
'(P2-P3)': {0: np.nan , 1: 1.0, 2: np.nan , 3: 1.0, 4: np.nan , 5: 1.0, 6: 1.0, 7: np.nan },
'(P1-P2)': {0: 0.085, 1: 0.032, 2: 0.081, 3: 0.037, 4: 0.076, 5: np.nan , 6: np.nan , 7: 0.0835},
'(P1-P7)': {0: np.nan , 1: 1.081, 2: np.nan , 3: 1.065, 4: np.nan , 5: np.nan , 6: np.nan , 7: np.nan },
'(P3-P7)': {0: np.nan , 1: np.nan , 2: np.nan , 3: np.nan , 4: np.nan , 5: 3.871, 6: 2.389, 7: np.nan }})
英文:
Add DataFrame.reorder_levels
with DataFrame.sort_index
:
df2 = df1.pivot_table(values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'],
index=['ID','NO','DTime'],
columns=['M2','M1'],
aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)
If need processing all columns from input DataFrame not set in index
and columns
parameter is possible omit values
:
df2 = df1.pivot_table(index=['ID','NO','DTime'],
columns=['M2','M1'],
aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)
Input data:
df1 = pd.DataFrame({'ID': {0: 2801596, 1: 2801596, 2: 2801584, 3: 2801584, 4: 2801598, 5: 2801599, 6: 2801599, 7: 2801600},
'M1': {0: 288, 1: 289, 2: 290, 3: 291, 4: 288, 5: 288, 6: 289, 7: 291},
'M2': {0: 371, 1: 371, 2: 372, 3: 372, 4: 371, 5: 371, 6: 371, 7: 372},
'NO': {0: 536529, 1: 536529, 2: 541278, 3: 541278, 4: 541279, 5: 555623, 6: 555623, 7: 555624},
'DTime': {0: '08-02-2023 11:07', 1: '08-02-2023 11:07', 2: '08-02-2023 11:10', 3: '08-02-2023 11:10',
4: '08-02-2023 11:12', 5: '08-02-2023 11:14', 6: '08-02-2023 11:14', 7: '08-02-2023 11:18'},
'Result': {0: 1, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1},
'(P2-P3)': {0: np.nan , 1: 1.0, 2: np.nan , 3: 1.0, 4: np.nan , 5: 1.0, 6: 1.0, 7: np.nan },
'(P1-P2)': {0: 0.085, 1: 0.032, 2: 0.081, 3: 0.037, 4: 0.076, 5: np.nan , 6: np.nan , 7: 0.0835},
'(P1-P7)': {0: np.nan , 1: 1.081, 2: np.nan , 3: 1.065, 4: np.nan , 5: np.nan , 6: np.nan , 7: np.nan },
'(P3-P7)': {0: np.nan , 1: np.nan , 2: np.nan , 3: np.nan , 4: np.nan , 5: 3.871, 6: 2.389, 7: np.nan }})
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论