How to alphabetically sort values in a pandas column that is a list
Question
I had a question about sorting values in a list in a dataframe. I have sales order data with about 3000 rows. A sales order can contain one or multiple inventory items. I reshaped my data and was able to create a dataset like this:
Current result:
sales_order | bundle |
---|---|
so-1 | apple, pear, tomato |
so-2 | bread, fish |
so-3 | fish, bread |
so-4 | pear, tomato, apple |
so-5 | tomato |
I would like to sort the items within each 'bundle' value alphabetically, row by row, so that I can group on the resulting string, count the groups, and visualize them in Tableau against other data in my dataframe, but I am having trouble doing so.
I tried various things like a lambda function and 'sort', but that ruined the data. I also tried using sort_values on the bundle column, which helped in certain ways, but did not sort the items within each list alphabetically.
Desired result:
sales_order | bundle |
---|---|
so-1 | apple, pear, tomato |
so-2 | bread, fish |
so-3 | bread, fish |
so-4 | apple, pear, tomato |
so-5 | tomato |
Is there any way to parse each 'bundle' column value and sort its items alphabetically?
I did try:
z = df.copy()
z = z['bundle'].sort_values()
This works for the sample data I provided in this question, but only because the data is oversimplified. It doesn't work on my actual data.
Here is an example where sort_values is not sorting my data:
import pandas as pd

testdf = pd.DataFrame({
    'sales_order': ['so-1', 'so-2'],
    'Inv_item': ['Revolve 3 Clean 110cm,Resolution 3 Glare - 120bp- Normal viscosity',
                 'Resolution 3 Glare - 120bp - Normal viscosity,Revolve 3 Clean 110cm'],
})
testdf['Inv_item'].sort_values()
1 Resolution 3 Glare - 120bp - Normal viscosity,...
0 Revolve 3 Clean 110cm,Resolution 3 Glare - 120...
Name: Inv_item, dtype: object
Thanks for reading/helping!
Answer 1
Score: 0
You could use the following to sort the items within each value:
df['bundle'].str.split(', ').map(sorted).str.join(', ')
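The same idea may need a small adjustment for the real data in the question, where items are separated by a bare comma rather than a comma plus a space. Below is a minimal sketch, assuming the items themselves never contain commas; `sort_items` and the `Inv_item_sorted` column are hypothetical names introduced only for illustration.

```python
import pandas as pd

# Sample frame copied from the question.
testdf = pd.DataFrame({
    'sales_order': ['so-1', 'so-2'],
    'Inv_item': [
        'Revolve 3 Clean 110cm,Resolution 3 Glare - 120bp- Normal viscosity',
        'Resolution 3 Glare - 120bp - Normal viscosity,Revolve 3 Clean 110cm',
    ],
})


def sort_items(cell: str) -> str:
    """Hypothetical helper: split on commas, trim whitespace, sort, re-join."""
    items = [item.strip() for item in cell.split(',')]
    return ', '.join(sorted(items))


# Sort the items inside each cell; the row order is left untouched.
testdf['Inv_item_sorted'] = testdf['Inv_item'].map(sort_items)
print(testdf['Inv_item_sorted'].tolist())
```

Note that `sorted` compares strings case-sensitively, so uppercase letters sort before lowercase ones; passing `key=str.casefold` is one way to ignore case. Also, the two sample rows still differ after sorting, because one contains "120bp- Normal" and the other "120bp - Normal". Inconsistencies like that inside an item would have to be cleaned up separately before grouping.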
Another option is to use frozenset, which can be used in groupby(). It won't sort your values, but could be useful.
df['bundle'].str.split(', ').map(frozenset)
Output of the first snippet:
0 apple, pear, tomato
1 bread, fish
2 bread, fish
3 apple, pear, tomato
4 tomato
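Since the goal in the question is to group and count the bundles, here is a rough sketch of how the sorted string could feed a groupby. It rebuilds the sample data from the question; the `bundle_sorted` and `order_count` column names are assumptions made for this example.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'sales_order': ['so-1', 'so-2', 'so-3', 'so-4', 'so-5'],
    'bundle': ['apple, pear, tomato', 'bread, fish', 'fish, bread',
               'pear, tomato, apple', 'tomato'],
})

# Normalize each bundle so that item order no longer matters.
df['bundle_sorted'] = df['bundle'].str.split(', ').map(sorted).str.join(', ')

# Count how many sales orders share each normalized bundle.
bundle_counts = (
    df.groupby('bundle_sorted')
      .size()
      .reset_index(name='order_count')
)
print(bundle_counts)
```

With the sample data, 'apple, pear, tomato' and 'bread, fish' each appear in two orders and 'tomato' in one. The resulting frame can then be exported or joined back to the other data for visualization.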