Concatenate column levels to an existing multi-column pandas dataframe

Question

I have two DataFrames whose columns are a MultiIndex. Both DataFrames have exactly the same number of columns.

import pandas as pd

columns_df1 = pd.MultiIndex.from_tuples([
    ('A', 1, 'X', 'Y', 'Z'),
    ('B', 2, 'X', 'Y', 'Z'),
    ('C', 3, 'X', 'Y', 'Z')
], names=['level1', 'level2', 'level3', 'level4', 'level5'])

df1 = pd.DataFrame([[1, 2, 3]], columns=columns_df1)

columns_df2 = pd.MultiIndex.from_tuples([
    ('D', 4, 'P', 'Q', 'R'),
    ('E', 5, 'P', 'Q', 'R'),
    ('F', 6, 'P', 'Q', 'R')
], names=['level6', 'level7', 'level8', 'level9', 'level10'])

df2 = pd.DataFrame([[4, 5, 6]], columns=columns_df2)


print(df1)

print(df2)


level1  A  B  C
level2  1  2  3
level3  X  X  X
level4  Y  Y  Y
level5  Z  Z  Z
0       1  2  3

level6   D  E  F
level7   4  5  6
level8   P  P  P
level9   Q  Q  Q
level10  R  R  R
0        4  5  6

I need to add the last two column levels (level9 and level10) of df2 to df1. What's the best way to do this?

Expected result:

level1  A  B  C
level2  1  2  3
level3  X  X  X
level4  Y  Y  Y
level5  Z  Z  Z
level9  Q  Q  Q
level10 R  R  R
0       1  2  3

Answer 1

Score: 4

One option is to zip the two indexes together:

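last = 2  # number of trailing column levels to take from df2
# FrozenList.union appends the second list of level names to the first
names = df1.columns.names.union(df2.columns.names[-last:])

# Pair the column tuples by position and extend each df1 tuple with df2's last entries
tuples = [i1 + i2[-last:] for i1, i2 in zip(df1.columns, df2.columns)]
out = df1.set_axis(pd.MultiIndex.from_tuples(tuples, names=names), axis=1)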

Output:

level1   A  B  C
level2   1  2  3
level3   X  X  X
level4   Y  Y  Y
level5   Z  Z  Z
level9   Q  Q  Q
level10  R  R  R
0        1  2  3
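
For reuse, the same idea can be wrapped in a small function. This is only a minimal sketch and not part of the original answer: the helper name append_last_levels and its parameters are hypothetical. It pairs the columns purely by position, so it assumes both frames have the same number of columns and that they are already in the intended order.

```python
import pandas as pd


def append_last_levels(target, source, last):
    """Return a copy of `target` whose columns gain the last `last`
    column levels of `source` (hypothetical helper, positional pairing)."""
    if last < 1:
        raise ValueError("last must be a positive integer")
    if len(target.columns) != len(source.columns):
        raise ValueError("both frames must have the same number of columns")

    # Level names: all of target's, followed by the selected names from source
    names = list(target.columns.names) + list(source.columns.names[-last:])
    # Each column label is a tuple; extend it with the tail of the matching source tuple
    tuples = [t + s[-last:] for t, s in zip(target.columns, source.columns)]
    new_columns = pd.MultiIndex.from_tuples(tuples, names=names)
    # set_axis returns a new DataFrame and leaves `target` unchanged
    return target.set_axis(new_columns, axis=1)


columns_df1 = pd.MultiIndex.from_tuples([
    ('A', 1, 'X', 'Y', 'Z'),
    ('B', 2, 'X', 'Y', 'Z'),
    ('C', 3, 'X', 'Y', 'Z')
], names=['level1', 'level2', 'level3', 'level4', 'level5'])
df1 = pd.DataFrame([[1, 2, 3]], columns=columns_df1)

columns_df2 = pd.MultiIndex.from_tuples([
    ('D', 4, 'P', 'Q', 'R'),
    ('E', 5, 'P', 'Q', 'R'),
    ('F', 6, 'P', 'Q', 'R')
], names=['level6', 'level7', 'level8', 'level9', 'level10'])
df2 = pd.DataFrame([[4, 5, 6]], columns=columns_df2)

out = append_last_levels(df1, df2, last=2)
print(out)
```

The length check is there because zip would otherwise silently truncate to the shorter index.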

Answer 2

Score: 0

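Another option is to use DataFrames as intermediates: convert both column indexes with pandas.MultiIndex.to_frame, join the pieces side by side with concat, and rebuild the index with from_frame: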

new_idx = pd.MultiIndex.from_frame(pd.concat([df1.columns.to_frame(index=False),
                                              df2.columns.to_frame(index=False).iloc[:, -2:],
                                             ], axis=1))

out = df1.set_axis(new_idx, axis=1)

Output:

level1   A  B  C
level2   1  2  3
level3   X  X  X
level4   Y  Y  Y
level5   Z  Z  Z
level9   Q  Q  Q
level10  R  R  R
0        1  2  3
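
Because to_frame turns each level into a named column, the levels to append can also be picked by name instead of by position. The snippet below is a sketch of that variant, not the answerer's exact code: the variable names wanted, left and right are made up for illustration, and the setup is repeated from the question so the snippet runs on its own. The rows of the two intermediate frames are still matched by position, so df1 and df2 need the same number of columns.

```python
import pandas as pd

columns_df1 = pd.MultiIndex.from_tuples([
    ('A', 1, 'X', 'Y', 'Z'),
    ('B', 2, 'X', 'Y', 'Z'),
    ('C', 3, 'X', 'Y', 'Z')
], names=['level1', 'level2', 'level3', 'level4', 'level5'])
df1 = pd.DataFrame([[1, 2, 3]], columns=columns_df1)

columns_df2 = pd.MultiIndex.from_tuples([
    ('D', 4, 'P', 'Q', 'R'),
    ('E', 5, 'P', 'Q', 'R'),
    ('F', 6, 'P', 'Q', 'R')
], names=['level6', 'level7', 'level8', 'level9', 'level10'])
df2 = pd.DataFrame([[4, 5, 6]], columns=columns_df2)

wanted = ['level9', 'level10']  # levels of df2 to append, selected by name

# One row per column label, one column per level; index=False gives a plain 0..n-1 index
left = df1.columns.to_frame(index=False)
right = df2.columns.to_frame(index=False)[wanted]

# Glue the level columns side by side, then turn the frame back into a MultiIndex;
# from_frame takes the level names from the frame's column names
new_idx = pd.MultiIndex.from_frame(pd.concat([left, right], axis=1))
out = df1.set_axis(new_idx, axis=1)
print(out)
```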

huangapple
  • Posted on July 31, 2023, 22:34:33
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76804639.html