Pandas: How can I select every nth column including groups of columns that are less than n?
Question
I want to select columns in groups of 5, including a final group that has fewer than 5 columns, for example:
Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7 |
---|---|---|---|---|---|---|
Cell 1 | Cell 2 | Cell 3 | Cell 4 | Cell 5 | Cell 6 | Cell 7 |
Cell 8 | Cell 9 | Cell 10 | Cell 11 | Cell 12 | Cell 13 | Cell 14 |
New columns will keep being added to the dataframe, which is why I want a loop that sums the columns group by group. To explain further, I want to take the sum of Column 1 through Column 5 and store it in a new column called "Column Sum 1", then take the sum of Column 6 and Column 7 and store it as "Column Sum 2".
I tried working with
loc[1:5].apply(np.sum, axis=1)
and it works if the column group has exactly 5 columns. However, if the group has fewer than 5 columns, it returns NaN instead of the sum of the last few columns.
Answer 1
Score: 0
Create a custom range grouper to group the dataframe along the column axis, then aggregate with sum:
df.groupby(np.arange(df.shape[1]) // 5, axis=1).sum()
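As far as I know, recent pandas releases deprecate the axis=1 argument of groupby, so here is a minimal sketch of the same idea that groups the transposed frame instead. The sample data, the variable names (n, group_ids, sums, result) and the "Column Sum N" naming are assumptions taken from the question, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 2 rows, 7 numeric columns (values 1..14)
df = pd.DataFrame(
    np.arange(1, 15).reshape(2, 7),
    columns=[f"Column {i}" for i in range(1, 8)],
)

n = 5  # group size; the last group may hold fewer than n columns

# Integer division assigns each column position to a group:
# positions 0..4 -> group 0, positions 5..6 -> group 1
group_ids = np.arange(df.shape[1]) // n

# Transpose so the columns become rows, group those rows, sum, transpose back
sums = df.T.groupby(group_ids).sum().T

# Rename the group labels 0, 1, ... to "Column Sum 1", "Column Sum 2", ...
sums.columns = [f"Column Sum {i + 1}" for i in sums.columns]

# Attach the per-group sums to the original frame
result = df.join(sums)
print(result)
```

Because the group labels are computed from df.shape[1], the same code should keep working as new columns are appended: a trailing group with fewer than 5 columns is simply summed over the columns it has.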