Sorting in a pandas DataFrame
Question
Suppose I create a pandas DataFrame as follows:
import pandas as pd
my_newobj = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 4, 7, 1]})
I then pass a list of column names to the by parameter of sort_values:
my_newobj.sort_values(by=['b', 'a'])
I understand that the primary sort is done on column b, but why isn't a secondary sort performed on column a as well, even though I passed a list of column names to sort on?
Am I misunderstanding something here?
Answer 1
Score: 1
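For your data there is never a tie in the first sort column (in other words, the values in column b are unique), so the second column never comes into play. The secondary key is only used to order rows that share the same value in the primary key. Consider the following example, where b contains repeated values: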
import pandas as pd
df = pd.DataFrame({"b":[1,1,1,0,0,0],"a":[1,7,3,5,3,2]})
print(df.sort_values(by=['b', 'a']))
This gives the output:
   b  a
5  0  2
4  0  3
3  0  5
0  1  1
2  1  3
1  1  7
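As a quick check of the point above, here is a minimal sketch using the data from the question. The names by_both and by_b_only are my own, chosen for illustration. It compares sorting by ['b', 'a'] with sorting by 'b' alone; because b has no repeated values, the two results should be identical.

```python
import pandas as pd

my_newobj = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 4, 7, 1]})

# b has no repeated values, so column a is never needed to break a tie.
by_both = my_newobj.sort_values(by=['b', 'a'])
by_b_only = my_newobj.sort_values(by='b')

print(my_newobj['b'].is_unique)    # whether b is free of duplicates
print(by_both.equals(by_b_only))   # whether both sorts give the same frame
print(by_both)
```

If b contained duplicates, the two results could differ, which is the situation the example above demonstrates.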
Answer 2
Score: 0
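This is a little overcomplicated, but it works if the goal is to sort each column independently. Note that values which shared a row in the original DataFrame will no longer line up afterwards:
import pandas as pd
my_newobj = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 4, 7, 1]})
Sort each column separately and drop the original index values:
a = my_newobj['a'].sort_values().reset_index(drop=True)
b = my_newobj['b'].sort_values().reset_index(drop=True)
# Equivalent, since inplace=False is the default:
# my_newobj['b'].sort_values(inplace=False).reset_index(drop=True)
Recombine the columns:
final = pd.DataFrame()
final['b'] = b
final['a'] = a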
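To make the difference between the two approaches concrete, the sketch below puts the per-column sort next to sort_values(by=['b', 'a']) on the question's data. The names row_sorted, original_rows and final_rows are assumptions introduced for this illustration. It checks whether the rows produced by the per-column sort all exist in the original frame.

```python
import pandas as pd

my_newobj = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 4, 7, 1]})

# Per-column sort: each column is ordered on its own.
final = pd.DataFrame({
    'b': my_newobj['b'].sort_values().reset_index(drop=True),
    'a': my_newobj['a'].sort_values().reset_index(drop=True),
})

# Row-preserving sort: whole rows are reordered by b, then by a.
row_sorted = my_newobj.sort_values(by=['b', 'a']).reset_index(drop=True)

# Collect the (b, a) pairs of each frame as plain tuples.
original_rows = set(my_newobj.itertuples(index=False, name=None))
final_rows = set(final.itertuples(index=False, name=None))

print(final)
print(row_sorted)
# False here means the per-column sort created pairs not in the original data.
print(final_rows <= original_rows)
```

Which approach is appropriate depends on whether each row represents one record. If it does, sort_values with a list of keys keeps the records intact.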