How to alphabetically sort values in a pandas column that is a list

Question

I had a question about sorting values in a list in a dataframe. I have sales order data with about 3000 rows. A sales order can contain one or multiple inventory items. I reshaped my data and was able to create a dataset like this:

Current result:

    sales_order  bundle
    so-1         apple, pear, tomato
    so-2         bread, fish
    so-3         fish, bread
    so-4         pear, tomato, apple
    so-5         tomato

I would like to sort the items within each 'bundle' value alphabetically, row by row, so that I can group the resulting strings, count them, and visualize them in Tableau against other data in my dataframe, but I am having trouble doing so.

I tried various things, like a lambda function and 'sort', but that ruined the data. I also tried using sort_values on the bundle column, and this helped in certain ways, but it did not sort the items within each list alphabetically.

Desired result:

    sales_order  bundle
    so-1         apple, pear, tomato
    so-2         bread, fish
    so-3         bread, fish
    so-4         apple, pear, tomato
    so-5         tomato

Is there any way to parse through each 'bundle' column value and sort the items within it alphabetically?

I did try:

    z = df.copy()
    z = z['bundle'].sort_values()

This works for the sample data I provided in this question, but only because the data is oversimplified. It doesn't work on my actual data.

Here is an example where sort_values is not sorting my data:

    import pandas as pd

    testdf = pd.DataFrame({'sales_order': ['so-1', 'so-2'],
                           'Inv_item': ['Revolve 3 Clean 110cm,Resolution 3 Glare - 120bp- Normal viscosity',
                                        'Resolution 3 Glare - 120bp - Normal viscosity,Revolve 3 Clean 110cm']})

    testdf['Inv_item'].sort_values()

    1    Resolution 3 Glare - 120bp - Normal viscosity,...
    0    Revolve 3 Clean 110cm,Resolution 3 Glare - 120...
    Name: Inv_item, dtype: object

Thanks for reading/helping!

Answer 1

Score: 0

You could use the following to sort the items within each value:

df['bundle'].str.split(', ').map(sorted).str.join(', ')
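
For reference, here is a minimal, self-contained sketch of that one-liner applied to the sample data from the question. The `bundle_sorted` column name is just an illustrative choice, and the `value_counts()` step is one assumed way to do the counting the question mentions.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'sales_order': ['so-1', 'so-2', 'so-3', 'so-4', 'so-5'],
    'bundle': ['apple, pear, tomato', 'bread, fish', 'fish, bread',
               'pear, tomato, apple', 'tomato'],
})

# Split each string into a list, sort that list, then join it back into a string.
# 'bundle_sorted' is a hypothetical column name used only for this sketch.
df['bundle_sorted'] = df['bundle'].str.split(', ').map(sorted).str.join(', ')

# Count how many orders share each sorted bundle string
bundle_counts = df['bundle_sorted'].value_counts()
```

Because every row now lists its items in the same order, orders with the same contents end up with identical strings and can be grouped or counted directly.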

Another option is to use frozenset, which is hashable and can therefore be used as a key in groupby(). It won't sort your values, but it could be useful.

df['bundle'].str.split(', ').map(frozenset)
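
To illustrate why frozenset can work as a grouping key, here is a small sketch in plain Python. It uses `collections.Counter` instead of pandas purely to keep the example short, and the variable names `bundles`, `keys`, and `counts` are made up for this illustration.

```python
from collections import Counter

# Sample bundle strings copied from the question
bundles = ['apple, pear, tomato', 'bread, fish', 'fish, bread',
           'pear, tomato, apple', 'tomato']

# A frozenset ignores order, so differently ordered bundles compare equal
keys = [frozenset(b.split(', ')) for b in bundles]

# frozenset is hashable, so it can serve as a dictionary (or Counter) key
counts = Counter(keys)
```

Here 'bread, fish' and 'fish, bread' map to the same key. One caveat: a set also drops duplicates, so a bundle that lists the same item twice would collapse to a single entry.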

Output (of the first, sorted expression):

0    apple, pear, tomato
1            bread, fish
2            bread, fish
3    apple, pear, tomato
4                 tomato
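
The second example in the question separates items with a bare comma and has inconsistent spacing, so splitting on ', ' would not work there. Below is a hedged sketch of a small helper for that case. `normalize_bundle` and its `sep` parameter are hypothetical names, and stripping whitespace around each item is an assumption about what the data needs.

```python
def normalize_bundle(value, sep=','):
    """Return the items in `value` stripped, sorted, and re-joined with ', '."""
    items = [item.strip() for item in value.split(sep)]
    # Drop empty pieces left by stray separators such as a trailing comma
    items = [item for item in items if item]
    return ', '.join(sorted(items))


# The two strings from the question's testdf example
row0 = 'Revolve 3 Clean 110cm,Resolution 3 Glare - 120bp- Normal viscosity'
row1 = 'Resolution 3 Glare - 120bp - Normal viscosity,Revolve 3 Clean 110cm'

sorted0 = normalize_bundle(row0)
sorted1 = normalize_bundle(row1)

# With pandas, the same helper could be applied to every row:
# testdf['Inv_item'].map(normalize_bundle)
```

After this, both rows list the 'Resolution ...' item first. They still do not match exactly, because the sample strings themselves differ ('120bp- Normal' versus '120bp - Normal'), and that kind of inconsistency would need separate cleaning. Note also that `sorted` compares strings case-sensitively, so uppercase letters sort before lowercase ones.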

huangapple
  • Published on March 31, 2023 at 03:08:13
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75892079.html