How to split columns into two sub-columns under their parent column in Excel using Python

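Question

I need a suggestion in Python.
I have an Excel sheet with 20 columns whose values look like `12 --> 23`. I need to split every column into two parts, pre and post, where pre holds the value 12 and post holds the value 23, and so on. These two columns should also sit exactly under their respective parent column. I am using pandas.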

Input:

    | Column A  |
    | --------- |
    | 12 --> 23 |
    | 13 --> 24 |

Output:

    | Column A   |
    | pre | post |
    | 12  | 23   |
    | 13  | 24   |

There are many columns, so I can't use the column names directly.

I tried the following code. It works if I know the column name, but not when I have to loop over the columns without knowing their names:

```python
import pandas as pd

df = pd.read_excel('path/to/excel_file.xlsx')

for col in df.columns:
    new_cols = df[col].str.split(expand=True)

    df[col + '_1'] = new_cols[0]
    df[col + '_2'] = new_cols[1]
df.drop(columns=df.columns[:-2], inplace=True)
```

It's not working for me.



# Answer 1
**Score**: 2

You can use `str.split`:

```python
df[['pre', 'post']] = df['Column A'].str.split(r'\s*-->\s*', expand=True)
print(df)

# Output
#     Column A pre post
# 0  12 --> 23  12   23
# 1  13 --> 24  13   24
```

So for multiple columns:

```python
data = []

for col in df.columns:
    # Split on the arrow, ignoring any whitespace around it
    new_cols = df[col].str.split(r'\s*-->\s*', expand=True)
    if new_cols.shape[1] == 2:
        # Replace the parent column with its two halves
        df = df.drop(columns=col)
        data.append(new_cols.add_prefix(col + '_'))
df = pd.concat([df, *data], axis=1)
```

Output:

```python
>>> df
  Column A_0 Column A_1
0         12         23
1         13         24
```
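
As a side note on why the loop in the question likely fails: `str.split` with no pattern splits on whitespace, so `'12 --> 23'` yields three pieces (`12`, `-->`, `23`) and `new_cols[1]` is the arrow, not the second number. With the `drop` call placed after the loop, `df.drop(columns=df.columns[:-2])` also keeps only the last two columns of the whole frame. The sketch below shows the difference on the sample data from the question; the frame is built by hand instead of being read from Excel, which is an assumption made for illustration.

```python
import pandas as pd

# Sample data copied from the question; built in memory instead of read_excel.
df = pd.DataFrame({"Column A": ["12 --> 23", "13 --> 24"]})

# No pattern: splits on whitespace, so the arrow becomes its own column.
by_whitespace = df["Column A"].str.split(expand=True)

# Explicit pattern: the arrow and any surrounding spaces act as the delimiter.
by_arrow = df["Column A"].str.split(r"\s*-->\s*", expand=True)

print(by_whitespace.shape)  # three columns: value, arrow, value
print(by_arrow.shape)       # two columns: pre and post
```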
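
The question asks for `pre` and `post` to sit under their parent column, which flat names such as `Column A_0` do not quite give. One possible way to get there is a two-level column header (a pandas `MultiIndex`). This is a minimal sketch, not part of the original answer: the second column `Column B` and the fallback label `value` are hypothetical, and it assumes every column holds text.

```python
import pandas as pd

# Hypothetical sample with two parent columns in the question's format.
df = pd.DataFrame({
    "Column A": ["12 --> 23", "13 --> 24"],
    "Column B": ["1 --> 2", "3 --> 4"],
})

parts = {}
for col in df.columns:
    # Split on the arrow, ignoring whitespace around it.
    split = df[col].str.split(r"\s*-->\s*", expand=True)
    if split.shape[1] == 2:
        split.columns = ["pre", "post"]
        parts[col] = split
    else:
        # Column did not split into two parts: keep it under a single
        # sub-column ("value" is a made-up label for this sketch).
        parts[col] = df[[col]].rename(columns={col: "value"})

# The dict keys become the top header level; pre/post sit underneath.
result = pd.concat(parts, axis=1)
print(result)
```

With this layout, `result["Column A"]` returns the `pre` and `post` sub-columns of that parent. The values are still strings, so a cast such as `result.astype(int)` would be needed for arithmetic, assuming every cell is an integer. `result.to_excel(...)` should write both header levels as stacked header rows, though whether `index=False` is accepted together with multi-level columns appears to depend on the pandas version.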

huangapple
  • Posted on April 6, 2023, 22:55:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75950933.html