英文:
Join the columns and list them one by one in row
问题
order | content |
---|---|
a | c1 |
a | c2 |
a | c3 |
a | b1 |
a | b2 |
a | b3 |
b | c4 |
b | c5 |
b | c6 |
b | b4 |
b | b5 |
b | b6 |
英文:
I have Pandas dataset with orders and different types of packaging inside this order separated by comma inside the cell.
order | container | box |
---|---|---|
a | c1,c2,c3 | b1,b2,b3 |
b | c4,c5,c6 | b4,b5,b6 |
Need to get table with two columns: "order" and "content" with all values from both container and box.
I could only merge the container and box - but do not know how to list them row by row.
Needed table is:
order | content |
---|---|
a | c1 |
a | c2 |
a | c3 |
a | b1 |
a | b2 |
a | b3 |
b | c4 |
b | c5 |
b | c6 |
b | b4 |
b | b5 |
b | b6 |
答案1
得分: 1
以下是翻译好的部分:
你可以使用 stack
,split
+ explode
然后转换为 DataFrame:
out = (df.set_index('order').stack() # 将其他列设置在一边并堆叠
.str.split(',').explode() # 将值展开为多行
# 清理
.reset_index('order', name='content').reset_index(drop=True)
)
print(out)
输出结果:
order content
0 a c1
1 a c2
2 a c3
3 a b1
4 a b2
5 a b3
6 b c4
7 b c5
8 b c6
9 b b4
10 b b5
11 b b6
使用 melt
的另一种方法:
(df.melt('order', value_name='content')
.assign(content=lambda d: d['content'].str.split(','))
.explode('content').drop(columns='variable')
)
英文:
You can stack
, split
+explode
and convert to DataFrame:
out = (df.set_index('order').stack() # set other columns aside and stack
.str.split(',').explode() # expand values to multiple rows
# cleanup
.reset_index('order', name='content').reset_index(drop=True)
)
print(out)
Output:
order content
0 a c1
1 a c2
2 a c3
3 a b1
4 a b2
5 a b3
6 b c4
7 b c5
8 b c6
9 b b4
10 b b5
11 b b6
Alternative with melt
:
(df.melt('order', value_name='content')
.assign(content=lambda d: d['content'].str.split(','))
.explode('content').drop(columns='variable')
)
答案2
得分: 1
你还可以使用pd.DataFrame.melt
,pd.DataFrame.set_index
,pd.DataFrame.pipe
,pd.DataFrame.sort_index
来完成相同的操作:
(df
.melt(id_vars='order', value_vars=['container', 'box'])
.set_index('order')
.pipe(lambda x: x['value'].str.split(',').explode())
.sort_index()
)
或者更简洁的方法:
(df
.melt(id_vars='order', value_vars=['container', 'box'])
.set_index('order')['value'].str.split(',').explode()
.sort_index())
英文:
You could also use pd.DataFrame.melt
, pd.DataFrame.set_index
, pd.DataFrame.pipe
, pd.DataFrame.sort_index
:
(df
.melt(id_vars='order', value_vars=['container', 'box'])
.set_index('order')
.pipe(lambda x: x['value'].str.split(',').explode())
.sort_index()
)
order
a c1
a c2
a c3
a b1
a b2
a b3
b c4
b c5
b c6
b b4
b b5
b b6
or even a more concise approach::
(df
.melt(id_vars='order', value_vars=['container', 'box'])
.set_index('order')['value'].str.split(',').explode()
.sort_index())
答案3
得分: 0
以下是翻译好的部分:
import pandas as pd
df = pd.DataFrame({"order":["a","b"],"container":["c1,c2,c3","c4,c5,c6"],"box":["b1,b2,b3","b4,b5,b6"]})
df2 = pd.concat([df.order,df.container.str.split(",",expand=True),df.box.str.split(",",expand=True)],axis=1)
df2 = df2.melt("order")[["order","value"]]
print(df2)
输出:
order value
0 a c1
1 b c4
2 a c2
3 b c5
4 a c3
5 b c6
6 a b1
7 b b4
8 a b2
9 b b5
10 a b3
11 b b6
解释:使用 .str.split
和 expand=True
将逗号分隔的值分成单独的列,然后使用 .concat
创建包含这些列的 DataFrame,接着使用 .melt
在 "order" 列上获取所需的结果,最后选择所需的列。
英文:
I would do it following way
import pandas as pd
df = pd.DataFrame({"order":["a","b"],"container":["c1,c2,c3","c4,c5,c6"],"box":["b1,b2,b3","b4,b5,b6"]})
df2 = pd.concat([df.order,df.container.str.split(",",expand=True),df.box.str.split(",",expand=True)],axis=1)
df2 = df2.melt("order")[["order","value"]]
print(df2)
output
order value
0 a c1
1 b c4
2 a c2
3 b c5
4 a c3
5 b c6
6 a b1
7 b b4
8 a b2
9 b b5
10 a b3
11 b b6
Explanation: use .str.split
with expand=True
to get ,
-sheared values into separate columns, then use .concat
to create DataFrame
from them, then use .melt
at order
to get desired result and then select required columns.
答案4
得分: 0
df1.set_index("order").agg(','.join, axis=1).map(lambda x: x.split(',')).explode().rename("content")
输出:
order
a c1
a c2
a c3
a b1
a b2
a b3
b c4
b c5
b c6
b b4
b b5
b b6
英文:
df1.set_index("order").agg(','.join,axis=1).map(lambda x:x.split(",")).explode().rename("content")
out
order
a c1
a c2
a c3
a b1
a b2
a b3
b c4
b c5
b c6
b b4
b b5
b b6
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论