如何在Python中根据特定列取消堆叠表格

huangapple go评论63阅读模式
英文:

How to unstack the table based on specific columns in python

问题

我有一个名为df1的表格

ID M1 M2 NO DTime Result(P2-P3) (P1-P2) (P1-P7) (P3-P7)
2801596 288 371 536529 08-02-2023 11:07 1 NaN 0.085 NaN NaN
2801596 289 371 536529 08-02-2023 11:07 1 1 0.032 1.081 NaN
2801584 290 372 541278 08-02-2023 11:10 1 NaN 0.081 NaN NaN
2801584 291 372 541278 08-02-2023 11:10 0 1 0.037 1.065 NaN
2801598 288 371 541279 08-02-2023 11:12 1 NaN 0.076 NaN NaN
2801599 288 371 555623 08-02-2023 11:14 1 1 NaN NaN 3.871
2801599 289 371 555623 08-02-2023 11:14 1 1 NaN NaN 2.389
2801600 291 372 555624 08-02-2023 11:18 1 NaN 0.0835 NaN NaN

我尝试使用透视表,但它给出了一个充满NaN的表格。

df2 = pd.pivot_table(df1, values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'], index=['ID','No','DTime'], columns=['M2','M1'], aggfunc='first')

我想要这个表格作为输出。

如何在Python中根据特定列取消堆叠表格
1: https://i.stack.imgur.com/kM9UU.png

英文:

I have a table like this name df1

ID	     M1	 M2   NO	    DTime	      Result(P2-P3)	(P1-P2)	(P1-P7) (P3-P7) 
2801596	288	371	 536529	08-02-2023 11:07	1	 NaN	0.085	 NaN	  NaN
2801596	289	371	 536529	08-02-2023 11:07	1	  1	    0.032 	1.081	  NaN
2801584	290	372	 541278	08-02-2023 11:10	1	 NaN	0.081	 NaN	  NaN
2801584	291	372	 541278	08-02-2023 11:10	0	  1	    0.037	1.065	  NaN
2801598	288	371	 541279	08-02-2023 11:12	1	 NaN	0.076	 NaN	  NaN
2801599	288	371	 555623	08-02-2023 11:14	1	  1	     NaN	 NaN	 3.871
2801599	289	371	 555623	08-02-2023 11:14	1	  1	     NaN	 NaN	 2.389
2801600	291	372	 555624	08-02-2023 11:18	1	 NaN	0.0835   NaN	  NaN

I have tried using Pivot table but it is giving a table full of Nan.

df2 = pd.pivot_table(df1, values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'], index=['ID','No','DTime'],columns=['M2','M1'], aggfunc='first')

I want this table as output
如何在Python中根据特定列取消堆叠表格

答案1

得分: 2

DataFrame.reorder_levelsDataFrame.sort_index一起使用:

df2 = df1.pivot_table(values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'],
                     index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

如果需要处理输入DataFrame中未在indexcolumns参数中设置的所有列,可以省略values

df2 = df1.pivot_table(index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

输入数据:

df1 = pd.DataFrame({'ID': {0: 2801596, 1: 2801596, 2: 2801584, 3: 2801584, 4: 2801598, 5: 2801599, 6: 2801599, 7: 2801600}, 
                    'M1': {0: 288, 1: 289, 2: 290, 3: 291, 4: 288, 5: 288, 6: 289, 7: 291},
                    'M2': {0: 371, 1: 371, 2: 372, 3: 372, 4: 371, 5: 371, 6: 371, 7: 372}, 
                    'NO': {0: 536529, 1: 536529, 2: 541278, 3: 541278, 4: 541279, 5: 555623, 6: 555623, 7: 555624}, 
                    'DTime': {0: '08-02-2023 11:07', 1: '08-02-2023 11:07', 2: '08-02-2023 11:10', 3: '08-02-2023 11:10', 
                              4: '08-02-2023 11:12', 5: '08-02-2023 11:14', 6: '08-02-2023 11:14', 7: '08-02-2023 11:18'}, 
                    'Result': {0: 1, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}, 
                    '(P2-P3)': {0: np.nan , 1: 1.0, 2: np.nan , 3: 1.0, 4: np.nan , 5: 1.0, 6: 1.0, 7: np.nan }, 
                    '(P1-P2)': {0: 0.085, 1: 0.032, 2: 0.081, 3: 0.037, 4: 0.076, 5: np.nan , 6: np.nan , 7: 0.0835},
                    '(P1-P7)': {0: np.nan , 1: 1.081, 2: np.nan , 3: 1.065, 4: np.nan , 5: np.nan , 6: np.nan , 7: np.nan },
                    '(P3-P7)': {0: np.nan , 1: np.nan , 2: np.nan , 3: np.nan , 4: np.nan , 5: 3.871, 6: 2.389, 7: np.nan }})
英文:

Add DataFrame.reorder_levels with DataFrame.sort_index:

df2 = df1.pivot_table(values=['Result','(P2-P3)','(P1-P2)','(P1-P7)','(P3-P7)'],
                     index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

If need processing all columns from input DataFrame not set in index and columns parameter is possible omit values:

df2 = df1.pivot_table(index=['ID','NO','DTime'],
                     columns=['M2','M1'],
                     aggfunc='first').reorder_levels([1,2,0], axis=1).sort_index(axis=1)

Input data:

df1 = pd.DataFrame({'ID': {0: 2801596, 1: 2801596, 2: 2801584, 3: 2801584, 4: 2801598, 5: 2801599, 6: 2801599, 7: 2801600}, 
                    'M1': {0: 288, 1: 289, 2: 290, 3: 291, 4: 288, 5: 288, 6: 289, 7: 291},
                    'M2': {0: 371, 1: 371, 2: 372, 3: 372, 4: 371, 5: 371, 6: 371, 7: 372}, 
                    'NO': {0: 536529, 1: 536529, 2: 541278, 3: 541278, 4: 541279, 5: 555623, 6: 555623, 7: 555624}, 
                    'DTime': {0: '08-02-2023 11:07', 1: '08-02-2023 11:07', 2: '08-02-2023 11:10', 3: '08-02-2023 11:10', 
                              4: '08-02-2023 11:12', 5: '08-02-2023 11:14', 6: '08-02-2023 11:14', 7: '08-02-2023 11:18'}, 
                    'Result': {0: 1, 1: 1, 2: 1, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}, 
                    '(P2-P3)': {0: np.nan , 1: 1.0, 2: np.nan , 3: 1.0, 4: np.nan , 5: 1.0, 6: 1.0, 7: np.nan }, 
                    '(P1-P2)': {0: 0.085, 1: 0.032, 2: 0.081, 3: 0.037, 4: 0.076, 5: np.nan , 6: np.nan , 7: 0.0835},
                    '(P1-P7)': {0: np.nan , 1: 1.081, 2: np.nan , 3: 1.065, 4: np.nan , 5: np.nan , 6: np.nan , 7: np.nan },
                    '(P3-P7)': {0: np.nan , 1: np.nan , 2: np.nan , 3: np.nan , 4: np.nan , 5: 3.871, 6: 2.389, 7: np.nan }})

huangapple
  • 本文由 发表于 2023年6月12日 17:10:57
  • 转载请务必保留本文链接:https://go.coder-hub.com/76455131.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定