How do you concatenate same-named columns from an Excel file along axis 0 using Python?
Question
I am a Python beginner and, after hours of searching, I can't find a solution to my problem.
I have an Excel file with more than 2500 columns, each named 'Left pedal torque', 'Right pedal torque', or 'Delta time',
that I would like to combine into only 3 columns. Columns with the same name should be concatenated end to end into a single column. Here is a screenshot of an excerpt from the original Excel file and of the resulting file I would like to get in Python.
Start file:
Desired final file:
Thank you very much.
Answer 1
Score: 1
Here is one option using split and stack:
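import pandas as pd

# read_excel renames repeated headers to "name", "name.1", "name.2", ...
df = pd.read_excel("file.xlsx", sheet_name="Sheet1")
# Split "name.N" into (name, N), move the suffix level into the row index,
# then order the rows by suffix first and by original row second.
out = (df.set_axis(df.columns.str.split(".", expand=True), axis=1)
         .stack().sort_index(level=[1, 0], axis=0).reset_index(drop=True)
         [["Left pedal torque", "Right pedal torque", "Delta time"]]
)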
Output:
print(out)
    Left pedal torque  Right pedal torque   Delta time
0                   0                   0            0
1          7039698124           271539712  0,021798503
2          6168807507          2788786173  0,003650174
3          5805936337           344928813  0,003649631
4          4717323303           344928813  0,003905165
5          4209303856           396301198  0,003708225
6          3556136131          4256568432  0,003690864
7          2467523098          4329957485  0,003893772
8          2104651928          4843681335  0,003684896
9          7547717571          2568618774  0,003764648
10         6749401093          2862175226  0,003915473
11         6023659229          3229120731   0,00358724
12         5515639782          3596066236   0,00358941
13         4427026749          3742844582  0,003833008
14         3846432924          4036400795  0,003631185
15         3265839338          4476735592  0,003626845
16         2394948721          4843681335  0,003851454
17         1886929393          5284015656  0,003651801
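For comparison, here is a minimal sketch of a different route that does not rely on sorting the ".N" suffixes. Those suffixes are strings, so with hundreds of repeated blocks a sort could place "10" before "2". This is not the method from the answer above: the helper name stack_repeated_columns and the toy data are hypothetical, and the sketch assumes the headers have already been renamed by read_excel to "name", "name.1", "name.2", and so on, with every base name repeated the same number of times.

```python
import re

import pandas as pd

# Toy frame standing in for the result of pd.read_excel (assumed data):
# repeated headers arrive as "name", "name.1", "name.2", ...
df = pd.DataFrame({
    "Left pedal torque": [1, 2],
    "Right pedal torque": [10, 20],
    "Delta time": [0.1, 0.2],
    "Left pedal torque.1": [3, 4],
    "Right pedal torque.1": [30, 40],
    "Delta time.1": [0.3, 0.4],
})


def stack_repeated_columns(frame):
    """Concatenate columns that share a base name end to end (hypothetical helper)."""
    pieces = {}
    for col in frame.columns:
        # Strip a trailing ".<digits>" suffix to recover the base name.
        base = re.sub(r"\.\d+$", "", str(col))
        pieces.setdefault(base, []).append(frame[col])
    # Columns are visited left to right, so blocks keep their sheet order
    # and no sorting of suffixes is needed.
    return pd.DataFrame({
        base: pd.concat(cols, ignore_index=True)
        for base, cols in pieces.items()
    })


out = stack_repeated_columns(df)
print(out)
```

If the blocks have different lengths, the shorter columns would be padded with NaN at the end, which may or may not be what you want.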
If you need to write the resulting DataFrame to a new sheet, use this:
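import pandas as pd

df = pd.read_excel("file.xlsx", sheet_name="Sheet1")
out = (df.set_axis(df.columns.str.split(".", expand=True), axis=1)
         .stack()[["Left pedal torque", "Right pedal torque", "Delta time"]]
)
# Append to the existing workbook; "overlay" writes into Sheet2 without clearing it.
with pd.ExcelWriter("file.xlsx", mode="a",
                    engine="openpyxl", if_sheet_exists="overlay") as writer:
    start_row = 0
    # One group per repeated block (index level 1 holds the ".N" suffix).
    for i, (_, g) in enumerate(out.groupby(level=1, dropna=False, sort=False)):
        first = i == 0
        # Write the header only once so it does not overwrite the previous block.
        g.to_excel(writer, sheet_name="Sheet2", index=False,
                   header=first, startrow=start_row)
        start_row += len(g) + int(first)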