KeyError "not in index" when summing a list of columns in pandas

Question

I have a DataFrame with about 100 columns. I want to create a new column containing the sum of some of my columns, and I have a list with the names of the columns to sum.
For example, my columns are:

'Col1', 'Col2', 'Col3', ... , 'Col100'

and my list of columns to sum is:

sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

The list of columns to sum can change, so I can't do something like this:

df['total'] = df['col2'] + df['col17'] + df['col44']

My code:

df['total'] = df[sumlist].sum(axis=1)

But I get an error saying that all the columns of my DataFrame are not in the index:

>KeyError: "['Col1', 'Col2', 'Col3', ... , 'Col100'] not in index"

Then I tried resetting the index with df.reset_index(), but that didn't work.

Answer 1

Score: 1

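Use Index.intersection to select only the names in sumlist that actually exist in the DataFrame's columns:

import pandas as pd

df = pd.DataFrame({'col2': [5, 8],
                   'col17': [1, 2],
                   'col44': [7, 0]})

sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

# Sum only the columns present in both df.columns and sumlist
df['total'] = df[df.columns.intersection(sumlist)].sum(axis=1)
print(df)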
   col2  col17  col44  total
0     5      1      7     13
1     8      2      0     10
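
Note that Index.intersection silently skips any name that is not a column, so it can be useful to check which names were dropped. A minimal sketch, reusing the sample data from this answer; the variable name missing is only illustrative and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'col2': [5, 8],
                   'col17': [1, 2],
                   'col44': [7, 0]})

sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

# Names requested in sumlist that do not exist as column labels in df
# ("missing" is an illustrative name, not from the original answer)
missing = pd.Index(sumlist).difference(df.columns)
print(list(missing))
```

If this list is unexpectedly long, the names in sumlist probably do not match the column labels exactly (for example, a difference in letter case or stray whitespace).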

Another idea: column labels are case-sensitive, so maybe the names in sumlist need to be capitalized to match the DataFrame's columns?

sumlist = ['Col2', 'Col17', 'Col44', 'Col63', 'Col72', 'Col21', 'Col95']
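
If the mismatch really is only in letter case, as the 'Col2' vs 'col2' names in the question suggest, one option is to match the names case-insensitively instead of editing the list by hand. This is a sketch under the assumption that the lowercased column labels are unique; lookup and matched are illustrative names, and the sample data is made up for the example:

```python
import pandas as pd

# Hypothetical sample data with capitalized column labels
df = pd.DataFrame({'Col2': [5, 8],
                   'Col17': [1, 2],
                   'Col44': [7, 0]})

sumlist = ['col2', 'col17', 'col44', 'col63']

# Map each lowercased label to the actual column label in df
# (assumes no two labels differ only by case)
lookup = {str(c).lower(): c for c in df.columns}

# Keep only the requested names that have a case-insensitive match
matched = [lookup[name.lower()] for name in sumlist if name.lower() in lookup]

df['total'] = df[matched].sum(axis=1)
print(df)
```

Names without any match, such as 'col63' here, are simply left out of the sum.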

huangapple
  • Posted on June 5, 2023 at 13:24:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76403669.html