Sort Pandas DataFrame based on previous row in another column
Question
I have the following DataFrame in my Python project:
df1 = pd.DataFrame({"Col A":[1,2,3],"Col B":[3,2,2]})
I want to reorder it like this:
df2 = pd.DataFrame({"Col A":[1,3,2],"Col B":[3,2,2]})
My goal is for each value in Col A to match the value of Col B in the previous row.
Do you have any idea how to make this work properly with as little effort as possible?
I tried .sort_values(by=), but that is where my current knowledge stops.
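The condition described in the question can be written down as a small check, which may help when testing candidate solutions. This is only a sketch: the helper name is_chained and its parameters are made up for illustration and are not part of pandas.

```python
import pandas as pd

def is_chained(df, a="Col A", b="Col B"):
    """Return True if every Col A value equals the Col B value one row above.

    Hypothetical helper for illustration; not a pandas function.
    """
    prev_b = df[b].shift()  # Col B moved down by one row; the first row becomes NaN
    # Skip the first row, which has no predecessor to compare against.
    return bool((df[a].iloc[1:] == prev_b.iloc[1:]).all())

df1 = pd.DataFrame({"Col A": [1, 2, 3], "Col B": [3, 2, 2]})
df2 = pd.DataFrame({"Col A": [1, 3, 2], "Col B": [3, 2, 2]})

print(is_chained(df1))  # False: row 1 has Col A == 2, but the previous Col B is 3
print(is_chained(df2))  # True
```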
Answer 1
Score: 1
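If you need to roll the Col A values by one position within each Col B group, use a lambda function with groupby and transform. Note that this rotates the Col A values in place; the rows and their index labels are not reordered:
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"Col A":[1,2,3,7,4,8],"Col B":[3,2,2,1,1,1]})
print(df1)
   Col A  Col B
0      1      3
1      2      2
2      3      2
3      7      1
4      4      1
5      8      1
# Within each Col B group, shift the Col A values up by one; the first value wraps to the end
df1['Col A'] = df1.groupby('Col B')['Col A'].transform(lambda x: np.roll(x, -1))
print(df1)
   Col A  Col B
0      1      3
1      3      2
2      2      2
3      4      1
4      8      1
5      7      1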
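For readers unfamiliar with np.roll, here is what it does to a single group on its own. The array below is just the Col A values from the Col B == 1 group in the example above.

```python
import numpy as np

group = np.array([7, 4, 8])   # Col A values where Col B == 1
rolled = np.roll(group, -1)   # shift left by one; the first element wraps to the end
print(rolled)                 # [4 8 7]
```

A shift of -1 moves every element one position toward the start, which is why 7 ends up last in that group.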
Answer 2
Score: 0
Yes, you can get the desired output with sort_values() by first mapping each Col A value to the position it should take:
import pandas as pd
df1 = pd.DataFrame({"Col A":[1,2,3],"Col B":[3,2,2]})
# Map each Col A value to its target position in the result
# (hard-coded for this example: 1 first, then 3, then 2)
mapping_dict = {1: 0, 3: 1, 2: 2}
df1["sort_order"] = df1["Col A"].map(mapping_dict)
# Sort by the helper column, then drop it again
df2 = df1.sort_values(by="sort_order").drop(columns=["sort_order"])
print(df2)
Output:
   Col A  Col B
0      1      3
2      3      2
1      2      2
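If hard-coding the order is not an option, one possible alternative is to follow the chain row by row. The sketch below is not from either answer: chain_order is a hypothetical helper, it assumes the chain starts at the first row, and it is greedy (it takes the first unused row whose Col A matches), so it may fail to find a complete chain even when one exists. Rows that cannot be linked are appended in their original order.

```python
import pandas as pd

def chain_order(df, a="Col A", b="Col B"):
    """Greedily reorder rows so each Col A equals the previous row's Col B.

    Hypothetical helper for illustration; assumes the chain starts at row 0.
    """
    if df.empty:
        return df.copy()
    a_vals = df[a].tolist()
    b_vals = df[b].tolist()
    remaining = list(range(1, len(df)))  # positions not yet placed
    order = [0]                          # start from the first row
    while remaining:
        target = b_vals[order[-1]]       # value the next row's Col A must equal
        match = next((i for i in remaining if a_vals[i] == target), None)
        if match is None:
            break                        # chain cannot be extended any further
        remaining.remove(match)
        order.append(match)
    # Unlinked rows, if any, keep their original relative order at the end.
    return df.iloc[order + remaining]

df1 = pd.DataFrame({"Col A": [1, 2, 3], "Col B": [3, 2, 2]})
df2 = chain_order(df1)
print(df2)
```

For the DataFrame from the question this places the rows in the order 0, 2, 1. The loop is quadratic in the number of rows, which is fine for small frames but would need a lookup structure for large ones.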