Pandas: How can I select every nth column, including groups of fewer than n columns?

Question

I want to select columns in groups of 5, where the last group may contain fewer than 5 columns. For example:

Column 1  Column 2  Column 3  Column 4  Column 5  Column 6  Column 7
Cell 1    Cell 2    Cell 3    Cell 4    Cell 5    Cell 6    Cell 7
Cell 8    Cell 9    Cell 10   Cell 11   Cell 12   Cell 13   Cell 14

New columns keep being added to the dataframe, which is why I want to compute the column sums in a loop. Concretely, I want to sum Column 1 through Column 5 into a new column called "Column Sum 1", and sum Column 6 and Column 7 into a new column called "Column Sum 2".

I tried working with

loc[1:5].apply(np.sum, axis=1) 

This works when a column group has exactly 5 columns, but when the group has fewer than 5 it returns NaN instead of the sum of the remaining columns.

Answer 1

Score: 0

Create a custom range grouper to group the dataframe along the column axis, then aggregate with sum:

df.groupby(np.arange(df.shape[1]) // 5, axis=1).sum()
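
As a rough illustration of the grouper idea, here is a minimal sketch on made-up numeric data (the values, the variable names `n`, `group_ids`, `sums` and `result`, and the renaming step are assumptions, not part of the original answer). Because newer pandas versions deprecate the axis=1 argument of groupby, this variant transposes the frame, groups its rows, and transposes back, which should give the same sums:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 2 rows, 7 numeric columns holding 1..14.
df = pd.DataFrame(
    np.arange(1, 15).reshape(2, 7),
    columns=[f"Column {i}" for i in range(1, 8)],
)

n = 5  # width of each column group (assumed from the question)

# One integer label per column: 0 for the first n columns, 1 for the next n, ...
# The last group simply ends up with fewer than n columns.
group_ids = np.arange(df.shape[1]) // n

# Transpose so the columns become rows, group those rows, sum, transpose back.
sums = df.T.groupby(group_ids).sum().T

# Rename the group labels 0, 1, ... to "Column Sum 1", "Column Sum 2", ...
sums.columns = [f"Column Sum {i + 1}" for i in sums.columns]

# Attach the sums to the original frame.
result = df.join(sums)
print(result)
```

Since the group labels are derived from df.shape[1], rerunning this after new columns are added picks up the extra groups without an explicit loop.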

Posted by huangapple on 2023-02-19 00:59:25
When reposting, please keep the link to this article: https://go.coder-hub.com/75494899.html