How to str split and explode on multiple columns?
Question
Is there a way to split and explode on multiple columns?
This may be a basic task, but I am drawing a blank at the moment.
My pandas dataframe:
Name | Title | City | Country | Source |
---|---|---|---|---|
Haliey Wells | Data Scientist; Data Analyst; Mathematician | Paris; Suva; Paris | France; FIJI; France | |
Bron Levy | Data Scientist; Data Analyst | HELSINKI; Berlin | Finland; Germany | Kaggle |
Grace Kalie | Data Analyst; Mathematician | Athens; Budapest | Greece; Hungary | Kaggle |
Evan James | ML Engineer; Developer | Tokyo; Lima | Japan; Peru | |
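For anyone who wants to reproduce this, the table above could be built roughly as follows. This is only a sketch: the Source cells for the first and last rows are blank in the table, so the `Linkedin` and `Google` values here are an assumption borrowed from the output printed in the answers.

```python
import pandas as pd

# Sample frame matching the table in the question.
df = pd.DataFrame({
    "Name": ["Haliey Wells", "Bron Levy", "Grace Kalie", "Evan James"],
    "Title": [
        "Data Scientist; Data Analyst; Mathematician",
        "Data Scientist; Data Analyst",
        "Data Analyst; Mathematician",
        "ML Engineer; Developer",
    ],
    "City": [
        "Paris; Suva; Paris",
        "HELSINKI; Berlin",
        "Athens; Budapest",
        "Tokyo; Lima",
    ],
    "Country": [
        "France; FIJI; France",
        "Finland; Germany",
        "Greece; Hungary",
        "Japan; Peru",
    ],
    # Assumption: 'Linkedin' and 'Google' are taken from the answers' output,
    # because those cells are empty in the question's table.
    "Source": ["Linkedin", "Kaggle", "Kaggle", "Google"],
})

print(df)
```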
Currently the code that I have will only work on one column at a time:
df2 = (df.set_index(['Name', 'Source'])  # columns that should not be split go into the index
         .apply(lambda x: x.str.split(';').explode())  # split on ';' (the real delimiter is a pipe '|'; ';' is used here for readability)
         .reset_index())
Note: the code above generally works for me, but I usually use it on only one column.
Desired output:
Name | Title | City | Country | Source |
---|---|---|---|---|
Haliey Wells | Data Scientist | Paris | France | |
Haliey Wells | Data Analyst | Suva | FIJI | |
Haliey Wells | Mathematician | Paris | France | |
Bron Levy | Data Scientist | HELSINKI | Finland | Kaggle |
Bron Levy | Data Analyst | Berlin | Germany | Kaggle |
Grace Kalie | Data Analyst | Athens | Greece | Kaggle |
Grace Kalie | Mathematician | Budapest | Hungary | Kaggle |
Evan James | ML Engineer | Tokyo | Japan | |
Evan James | Developer | Lima | Peru | |
Answer 1
Score: 5
Split first, then do a normal multi-column explode. For example:
cols = ['Title', 'City', 'Country']
df.assign(**{c: df[c].str.split('; ') for c in cols}).explode(cols)
           Name           Title      City  Country    Source
0  Haliey Wells  Data Scientist     Paris   France  Linkedin
0  Haliey Wells    Data Analyst      Suva     FIJI  Linkedin
0  Haliey Wells   Mathematician     Paris   France  Linkedin
1     Bron Levy  Data Scientist  HELSINKI  Finland    Kaggle
1     Bron Levy    Data Analyst    Berlin  Germany    Kaggle
2   Grace Kalie    Data Analyst    Athens   Greece    Kaggle
2   Grace Kalie   Mathematician  Budapest  Hungary    Kaggle
3    Evan James     ML Engineer     Tokyo    Japan    Google
3    Evan James       Developer      Lima     Peru    Google
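To make the split-then-explode idea easy to run end to end, here is a minimal sketch on two rows of the question's data. Two things are my own additions rather than part of the answer: the intermediate name `split_df`, and `ignore_index=True`, which just renumbers the rows instead of repeating the original index labels. Passing a list of columns to `explode` needs pandas 1.3 or later.

```python
import pandas as pd

# Two rows from the question's table are enough to show the idea.
df = pd.DataFrame({
    "Name": ["Haliey Wells", "Bron Levy"],
    "Title": [
        "Data Scientist; Data Analyst; Mathematician",
        "Data Scientist; Data Analyst",
    ],
    "City": ["Paris; Suva; Paris", "HELSINKI; Berlin"],
    "Country": ["France; FIJI; France", "Finland; Germany"],
    "Source": ["Linkedin", "Kaggle"],
})

cols = ["Title", "City", "Country"]

# Step 1: turn each delimited string into a list of items.
split_df = df.assign(**{c: df[c].str.split("; ") for c in cols})

# Step 2: explode all list columns together so items stay aligned row by row.
# ignore_index=True gives a fresh 0..n-1 index instead of repeated labels.
result = split_df.explode(cols, ignore_index=True)

print(result)
```

Name and Source are not in `cols`, so their values are simply repeated for every exploded row.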
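Answer 2
Score: 2
Here's another way, using apply instead of a dictionary comprehension and unpacking:
```python
# Keep Name and Source in the index so apply only touches the columns to split.
(df.set_index(['Name', 'Source'])
   .apply(lambda x: x.str.split(';'))
   # df.columns[1:-1] picks Title, City and Country by position
   .explode(column=df.columns[1:-1].tolist())
   .reset_index())
```
Output: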
           Name    Source           Title      City  Country
0  Haliey Wells  Linkedin  Data Scientist     Paris   France
1  Haliey Wells  Linkedin    Data Analyst      Suva     FIJI
2  Haliey Wells  Linkedin   Mathematician     Paris   France
3     Bron Levy    Kaggle  Data Scientist  HELSINKI  Finland
4     Bron Levy    Kaggle    Data Analyst    Berlin  Germany
5   Grace Kalie    Kaggle    Data Analyst    Athens   Greece
6   Grace Kalie    Kaggle   Mathematician  Budapest  Hungary
7    Evan James    Google     ML Engineer     Tokyo    Japan
8    Evan James    Google       Developer      Lima     Peru
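Two caveats may be worth checking on real data. Splitting on a bare `';'` leaves the space after each delimiter attached to the items, and a multi-column `explode` raises a `ValueError` when the lists in a row do not all have the same length. The sketch below handles both. The row with a missing city is hypothetical data I made up for illustration, and the names `split_df`, `counts` and `mismatched` are mine as well.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Bron Levy", "Evan James"],
    "Title": ["Data Scientist; Data Analyst", "ML Engineer; Developer"],
    # Hypothetical: the second row has only one city for two titles.
    "City": ["HELSINKI; Berlin", "Tokyo"],
    "Source": ["Kaggle", "Google"],
})

cols = ["Title", "City"]

# Split every target column into lists.
split_df = df.assign(**{c: df[c].str.split(";") for c in cols})

# Count the items per cell; a row is a problem if the counts differ across columns.
counts = split_df[cols].apply(lambda s: s.str.len())
mismatched = counts.nunique(axis=1) > 1

# Explode only the rows that are safe, then trim whitespace left by the split.
result = split_df[~mismatched].explode(cols, ignore_index=True)
result[cols] = result[cols].apply(lambda s: s.str.strip())

print(result)
print("Rows needing manual review:", df.loc[mismatched, "Name"].tolist())
```

Setting the mismatched rows aside for review is just one option. Whether to pad, drop, or fix them depends on what the data means.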