Python pandas sort_values not working properly
Question
When I try to sort a DataFrame by a column's values and print it with head(), it shows duplicated rows instead of the desired result.
regions = country_features['world_region']
happiness = []
counts = []
reg = []
for region in regions:
    hap = country_features.loc[country_features['world_region'] == region, 'happiness_score'].mean()
    count = len(country_features[country_features['world_region'] == region])
    happiness.append(hap)
    counts.append(count)
    reg.append(region)
region_happiness = pd.DataFrame({'region': reg,
                                 'happiness_score': happiness,
                                 'country_count': counts})
region_happiness
region_happiness.happiness_score = pd.to_numeric(region_happiness.happiness_score)
sorted_df = region_happiness.sort_values(by='happiness_score', ascending=False)
sorted_df.head(5)
I want to sort the DataFrame by a column's values, and I expected it to be sorted correctly.
Answer 1
Score: -1
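The duplicated rows appear because the loop iterates over every row of country_features['world_region'] rather than over its unique values, so each region is appended once per country. The first part of the solution can be simplified with groupby: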
print(country_features)
  world_region  happiness_score
0         reg1                5
1         reg1                1
2         reg2               10
3         reg2                1
4         reg2                3
region_happines = (country_features.groupby('world_region', as_index=False)
                   .agg(happiness_score=('happiness_score', 'mean'),
                        country_count=('happiness_score', 'size'))
                   .rename(columns={'world_region': 'region'}))
print(region_happines)
  region  happiness_score  country_count
0   reg1         3.000000              2
1   reg2         4.666667              3
The happiness_score column already holds per-group averages, which are numeric, so no pd.to_numeric conversion is needed before sorting:
out = region_happines.sort_values(by='happiness_score', ascending=False)
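To make the cause of the duplicates concrete, here is a minimal sketch that contrasts looping over the raw column with looping over its unique values, then builds the sorted summary with groupby. The sample data is the hypothetical five-row frame from this answer, not the asker's real country_features, and the names rows_visited and regions_visited are introduced only for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's country_features.
country_features = pd.DataFrame({
    "world_region": ["reg1", "reg1", "reg2", "reg2", "reg2"],
    "happiness_score": [5, 1, 10, 1, 3],
})

# Iterating over the column visits every row, so each region
# shows up once per country. This is where the duplicates come from.
rows_visited = list(country_features["world_region"])

# Iterating over unique() visits each region exactly once.
regions_visited = list(country_features["world_region"].unique())

# groupby performs the same aggregation without an explicit loop,
# and sort_values then orders the one-row-per-region result.
region_happiness = (
    country_features.groupby("world_region", as_index=False)
    .agg(happiness_score=("happiness_score", "mean"),
         country_count=("happiness_score", "size"))
    .rename(columns={"world_region": "region"})
    .sort_values(by="happiness_score", ascending=False)
)

print(len(rows_visited), len(regions_visited))
print(region_happiness.head(5))
```

If the explicit loop is preferred, replacing the original `regions` with `country_features['world_region'].unique()` should be enough to remove the repeated rows.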