How to split columns into two sub-columns under their parent column using Python in Excel
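Question

I need some advice on Python.

I have an Excel sheet with 20 columns holding values like `12 --> 23`. I need to split every column into two parts, pre and post, where pre holds the value 12 and post holds the value 23, and so on. The new columns should sit directly under their respective parent column, using pandas.

Input:

| Column A |
| -------- |
| 12 --> 23|
| 13 --> 24|

Output:

| Column A |      |
| -------- | ---- |
| pre      | post |
| 12       | 23   |
| 13       | 24   |

There are many columns, so I can't refer to each one by name directly.

I tried the following code. It works if I know the column name, but not when I have to loop over the columns without naming them: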
```python
import pandas as pd

df = pd.read_excel('path/to/excel_file.xlsx')

for col in df.columns:
    new_cols = df[col].str.split(expand=True)
    df[col + '_1'] = new_cols[0]
    df[col + '_2'] = new_cols[1]

df.drop(columns=df.columns[:-2], inplace=True)
```

It's not working for me.
# Answer 1

**Score**: 2

You can use `str.split`:
```python
df[['pre', 'post']] = df['Column A'].str.split(r'\s*-->\s*', expand=True)
print(df)

# Output
    Column A pre post
0  12 --> 23  12   23
1  13 --> 24  13   24
```

So for multiple columns:

```python
data = []
for col in df.columns:
    # Split on "-->" and swallow any whitespace around it
    new_cols = df[col].str.split(r'\s*-->\s*', expand=True)
    if new_cols.shape[1] == 2:
        # Replace the original column with its two halves
        df = df.drop(columns=col)
        data.append(new_cols.add_prefix(col + '_'))
df = pd.concat([df, *data], axis=1)
```

Output:

```python
>>> df
  Column A_0 Column A_1
0         12         23
1         13         24
```
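
The flat `Column A_0` / `Column A_1` names do not quite give the layout asked for in the question, where `pre` and `post` sit under the parent column. One way to get closer is a two-level column header (a pandas `MultiIndex`). The following is only a minimal sketch, assuming every column holds strings of the form `a --> b`; the sample frame and the names `parts` and `result` are made up for illustration and are not part of the original answer:

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df = pd.DataFrame({
    "Column A": ["12 --> 23", "13 --> 24"],
    "Column B": ["1 --> 2", "3 --> 4"],
})

parts = {}
for col in df.columns:
    # Same split as above: "-->" with optional surrounding whitespace.
    split = df[col].str.split(r"\s*-->\s*", expand=True)
    if split.shape[1] == 2:
        # Label the two halves instead of keeping 0 and 1.
        split.columns = ["pre", "post"]
        parts[col] = split

# The dict keys become the top header level; pre/post form the second level.
result = pd.concat(parts, axis=1)
print(result)
```

A single half can then be selected with a tuple, for example `result[("Column A", "pre")]`. Columns that do not split into exactly two parts are skipped in this sketch, so they would need to be added back separately if they should be kept.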