Creating a conditional column in a multi-level DataFrame
Question
import pandas as pd
import numpy as np
level_2 = ['X', 'Y', 'X', 'Y', 'X', 'Y']
level_1 = ['A', 'A', 'B', 'B', 'C', 'C']
data = [['a1', 2, 'b1', 4, 'c1', 3], ['a2', 16, 'b2', 48, 'c2', 78], ['a3', 10, 'b3', 12, 'c3', 34], ['a4', 114, 'b4', 6, 'c4', 1]]
columns = pd.MultiIndex.from_tuples(list(zip(level_1, level_2)))
df = pd.DataFrame(data, columns=columns)
I'm very new to Python, so apologies for the basic nature of the question. I have the above dataframe. For each row, I would like to create two more columns based on the 2 greatest numbers in column [A][Y], [B][Y] or [C][Y], and then select the corresponding value from column [A][X], [B][X] or [C][X]. Any help would be greatly appreciated.
I've tried argsort but haven't been able to figure out how to reference the correct corresponding column.
Answer 1
Score: 2
Maybe not the prettiest solution, but it gets the job done (the main function is Series.nlargest):
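def fn(x):
    # x holds the Y values of one row, indexed by the top-level labels (A, B, C)
    x = x.nlargest(2)
    a, b = x            # the two largest Y values, largest first
    ia, ib = x.index    # the top-level labels those values came from
    return {
        ("Greatest1", "X"): f"{df_x.loc[x.name, ia]}",
        ("Greatest1", "Y"): a,
        ("Greatest2", "X"): f"{df_x.loc[x.name, ib]}",
        ("Greatest2", "Y"): b,
    }

# X values only, with columns A, B, C (looked up inside fn)
df_x = df.xs("X", axis=1, level=1)
# apply fn to each row of Y values and expand the returned dict into columns
x = df.xs("Y", axis=1, level=1).apply(fn, axis=1, result_type="expand")
df = pd.concat([df, x], axis=1)
print(df)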
Prints:
    A        B       C      Greatest1       Greatest2
    X    Y   X   Y   X   Y          X    Y          X   Y
0  a1    2  b1   4  c1   3         b1    4         c1   3
1  a2   16  b2  48  c2  78         c2   78         b2  48
2  a3   10  b3  12  c3  34         c3   34         b3  12
3  a4  114  b4   6  c4   1         a4  114         b4   6
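To see why the unpacking inside fn works, it may help to look at what Series.nlargest returns for a single row. This is only a minimal sketch, assuming the Y values of row 1 from the question; the names row_y and top2 are illustrative and not part of the answer above.

```python
import pandas as pd

# Y values of row 1 from the question, indexed by the top-level column labels.
# (row_y and top2 are hypothetical names used only for this illustration.)
row_y = pd.Series({"A": 16, "B": 48, "C": 78}, name=1)

top2 = row_y.nlargest(2)   # sorted in descending order, original labels kept
a, b = top2                # iterating a Series yields its values
ia, ib = top2.index        # the index carries the matching column labels

print(a, b)    # 78 48
print(ia, ib)  # C B
```

The labels ia and ib are what fn uses to look up the matching X values in df_x.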
Answer 2
Score: 2
Another possible option:
NBOG = 2  # number of greatest values to keep per row
lvl1 = df.columns.levels[1]  # second column level: X, Y
lst2 = [[f"Greatest{i+1}" for i in range(NBOG)], lvl1]
arrg = (
    df.stack(0).set_index(lvl1[0], append=True)
      .groupby(level=0, group_keys=False)[lvl1[1]]
      .nlargest(NBOG).droplevel(0).reset_index(level=1)
      .to_numpy().reshape(-1, len(lvl1)*NBOG)
)
out = df.join(pd.DataFrame(arrg, columns=pd.MultiIndex.from_product(lst2)))
Output:
print(out)
    A        B       C      Greatest1       Greatest2
    X    Y   X   Y   X   Y          X    Y          X   Y
0  a1    2  b1   4  c1   3         b1    4         c1   3
1  a2   16  b2  48  c2  78         c2   78         b2  48
2  a3   10  b3  12  c3  34         c3   34         b3  12
3  a4  114  b4   6  c4   1         a4  114         b4   6
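Since the question mentions trying argsort, here is a rough sketch of how that route could look. It is not taken from either answer: it swaps in plain NumPy indexing, and the names x_vals, y_vals, top and rows are made up for the illustration. It assumes the X and Y blocks share the same top-level column order (A, B, C), as they do in the question's data.

```python
import numpy as np
import pandas as pd

level_1 = ["A", "A", "B", "B", "C", "C"]
level_2 = ["X", "Y", "X", "Y", "X", "Y"]
data = [
    ["a1", 2, "b1", 4, "c1", 3],
    ["a2", 16, "b2", 48, "c2", 78],
    ["a3", 10, "b3", 12, "c3", 34],
    ["a4", 114, "b4", 6, "c4", 1],
]
columns = pd.MultiIndex.from_tuples(list(zip(level_1, level_2)))
df = pd.DataFrame(data, columns=columns)

# Split the frame into aligned X and Y arrays (columns A, B, C in both).
x_vals = df.xs("X", axis=1, level=1).to_numpy()
y_vals = df.xs("Y", axis=1, level=1).to_numpy()

# Column positions of the two largest Y values in each row, largest first.
# Negating turns the ascending argsort into a descending one.
top = np.argsort(-y_vals, axis=1, kind="stable")[:, :2]
rows = np.arange(len(df))

out = df.copy()
for rank in range(2):
    # Pick, per row, the X and Y entries at the selected column position.
    out[(f"Greatest{rank + 1}", "X")] = x_vals[rows, top[:, rank]]
    out[(f"Greatest{rank + 1}", "Y")] = y_vals[rows, top[:, rank]]

print(out)
```

The key step is that argsort returns column positions rather than values, so the same positions can index both the Y array and the X array.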