How do you concatenate Excel columns with the same name along axis 0 using Python?
Question
I am a beginner in Python. After hours of searching, I can't find a solution to my problem.
I have more than 2500 columns, named 'Left pedal torque', 'Right pedal torque' and 'Delta time', that I would like to combine into only 3 columns. Columns with the same name should be concatenated end to end into a single column. Here are screenshots of an excerpt from the original Excel file and of the resulting file I would like to produce with Python.
Start file: [screenshot]

Desired final file: [screenshot]

Thank you very much.
Answer 1
Score: 1
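Here is one option using split/stack:
import pandas as pd

out = (pd.read_excel("file.xlsx", sheet_name="Sheet1")
            # read_excel renames repeated headers to "name.1", "name.2", ...;
            # splitting on "." turns the columns into a (name, suffix) MultiIndex
            .pipe(lambda df: df.set_axis(df.columns.str.split(".", expand=True), axis=1))
            .stack().sort_index(level=[1,0], axis=0).reset_index(drop=True)
            [["Left pedal torque", "Right pedal torque", "Delta time"]]
)
Output:
print(out)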
    Left pedal torque  Right pedal torque   Delta time
0                   0                   0            0
1          7039698124           271539712  0,021798503
2          6168807507          2788786173  0,003650174
3          5805936337           344928813  0,003649631
4          4717323303           344928813  0,003905165
5          4209303856           396301198  0,003708225
6          3556136131          4256568432  0,003690864
7          2467523098          4329957485  0,003893772
8          2104651928          4843681335  0,003684896
9          7547717571          2568618774  0,003764648
10         6749401093          2862175226  0,003915473
11         6023659229          3229120731   0,00358724
12         5515639782          3596066236   0,00358941
13         4427026749          3742844582  0,003833008
14         3846432924          4036400795  0,003631185
15         3265839338          4476735592  0,003626845
16         2394948721          4843681335  0,003851454
17         1886929393          5284015656  0,003651801
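For comparison, here is a more explicit sketch of the same end-to-end concatenation that does not sort on the suffix (compared as strings, "10" sorts before "2", which may matter with hundreds of repeated blocks). It assumes pandas renamed the repeated headers to "name.1", "name.2", ... on read and that every block has the same number of rows. The helper `stack_repeated_columns` and the sample frame `wide` are made up for illustration and are not part of the original answer.

```python
import pandas as pd


def stack_repeated_columns(df, names):
    """Concatenate same-named columns end to end (hypothetical helper)."""
    # Assumption: repeated headers were renamed "name.1", "name.2", ... on read,
    # so stripping a trailing ".<digits>" recovers the original header.
    base = df.columns.str.replace(r"\.\d+$", "", regex=True)
    result = {}
    for name in names:
        block = df.loc[:, base == name]
        # Column-major flattening appends the columns in left-to-right order.
        result[name] = block.to_numpy().ravel(order="F")
    return pd.DataFrame(result)


# Made-up sample standing in for pd.read_excel("file.xlsx", sheet_name="Sheet1")
wide = pd.DataFrame({
    "Left pedal torque": [1, 2],
    "Right pedal torque": [10, 20],
    "Delta time": [0.1, 0.2],
    "Left pedal torque.1": [3, 4],
    "Right pedal torque.1": [30, 40],
    "Delta time.1": [0.3, 0.4],
})
names = ["Left pedal torque", "Right pedal torque", "Delta time"]
combined = stack_repeated_columns(wide, names)
print(combined)
```

With the real file, `wide` would be the frame returned by `pd.read_excel`. If the blocks have different lengths, the trailing empty rows would need to be dropped per block first, which this sketch does not do.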
If you need to write the resulting dataframe to a new sheet, use this:
import pandas as pd

out = (pd.read_excel("file.xlsx", sheet_name="Sheet1")
            .pipe(lambda df: df.set_axis(df.columns.str.split(".", expand=True), axis=1))
            .stack()[["Left pedal torque", "Right pedal torque", "Delta time"]]
)
with pd.ExcelWriter("file.xlsx", mode="a",
                     engine="openpyxl", if_sheet_exists="overlay") as writer:
    start_row = 0
    for _, g in out.groupby(level=1, dropna=False, sort=False):
        # write the header once, above the first block, so later blocks
        # do not overwrite the last data row of the previous one
        write_header = start_row == 0
        g.to_excel(writer, sheet_name="Sheet2", index=False,
                   header=write_header, startrow=start_row)
        start_row += len(g) + int(write_header)