Pandas: group string values
Question
I have the following dataset:
import pandas as pd
df = pd.DataFrame({"booking": ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
                   "city": ['a', 'b', 'c', 'c', 'a', 'a', 'b', 'b', 'c', 'd', 'e'],
                   "orders": [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10]})
I need to group the values to see the sum of orders for each city. I think the best way is to create a new DataFrame with two columns:
- a list of cities
- a list with the sum of orders for each city
The second point is clear:
new_df1 = df.groupby(['booking', 'city'])['orders'].apply(sum).groupby(level=0).apply(list)
print(new_df1)
The first point is not: I need to get the list of unique cities for each booking, like:
needed_df = pd.DataFrame({"booking": ['A', 'B', 'C'],
                          "city_list": [['a', 'b', 'c'], ['a', 'b'], ['c', 'd', 'e']]})
Answer 1
Score: 1
Try unique, which keeps the distinct cities of each booking in order of first appearance:
out = df.groupby('booking')['city'].agg(pd.unique).reset_index()
Out[7]:
  booking       city
0       A  [a, b, c]
1       B     [a, b]
2       C  [c, d, e]
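The question also wants the per-city sums next to the city list. One possible way to get both columns in a single frame is sketched below; it is an illustration rather than part of the original answer, and the names per_city, result, and orders_list are made up for this example (city_list follows the question's needed_df). It assumes a pandas version that supports named aggregation (0.25 or later).

```python
import pandas as pd

df = pd.DataFrame({
    "booking": ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C'],
    "city": ['a', 'b', 'c', 'c', 'a', 'a', 'b', 'b', 'c', 'd', 'e'],
    "orders": [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
})

# Sum orders per (booking, city) first, so each city appears once per booking.
per_city = df.groupby(["booking", "city"], as_index=False)["orders"].sum()

# Collect the cities and their sums into parallel lists, one row per booking.
# "orders_list" is a hypothetical column name chosen for this sketch.
result = (
    per_city.groupby("booking")
    .agg(city_list=("city", list), orders_list=("orders", list))
    .reset_index()
)
print(result)
```

Because the sums are computed before the lists are built, the two lists in each row line up position by position. Note that groupby sorts the group keys by default, so the cities come out sorted here, whereas pd.unique keeps the order of first appearance.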