KeyError "not in index" when summing a list of columns in pandas


Question

I have a DataFrame with about 100 columns. I want to create a new column containing the sum of some of those columns, and I have a list with the names of the columns to sum.
For example, my columns are:

    'Col1', 'Col2', 'Col3', ... , 'Col100'

and my list of columns to sum is:

    sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

The list of columns to sum can change, so I can't do something like this:

    df['total'] = df['col2'] + df['col17'] + df['col44']

My code:

    df['total'] = df[sumlist].sum(axis=1)

But I get an error saying that none of the columns in my DataFrame are in the index:

    KeyError: "['Col1', 'Col2', 'Col3', ... , 'Col100'] not in index"

Then I tried resetting the index with df.reset_index(), but that didn't help.

Answer 1

Score: 1

Use Index.intersection to select only the names in sumlist that actually exist as columns of df:

    import pandas as pd

    df = pd.DataFrame({'col2': [5, 8],
                       'col17': [1, 2],
                       'col44': [7, 0]})
    sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

    # Sum only the names in sumlist that are real columns of df
    df['total'] = df[df.columns.intersection(sumlist)].sum(axis=1)
    print(df)

Output:

       col2  col17  col44  total
    0     5      1      7     13
    1     8      2      0     10
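Note that intersection silently skips any name that is not a column, so it can hide a typo or a case mismatch. A minimal sketch for checking which names are missing before summing, assuming the same toy df as above (the names `missing` and `present` are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'col2': [5, 8],
                   'col17': [1, 2],
                   'col44': [7, 0]})
sumlist = ['col2', 'col17', 'col44', 'col63', 'col72', 'col21', 'col95']

# Names requested in sumlist that are not columns of df (order of sumlist kept)
missing = [name for name in sumlist if name not in df.columns]
print("missing:", missing)

# Sum only the columns that do exist
present = df.columns.intersection(sumlist)
df['total'] = df[present].sum(axis=1)
print(df)
```

If `missing` is unexpectedly long, the names in sumlist probably don't match the real column labels, which is worth fixing before relying on the total.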

Another idea: maybe the names in sumlist need to be capitalized to match the column names?

    sumlist = ['Col2', 'Col17', 'Col44', 'Col63', 'Col72', 'Col21', 'Col95']
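If letter case is the only difference (the question shows columns like 'Col1' but a lowercase sumlist), one option is to match the names case-insensitively instead of editing the list by hand. A rough sketch, assuming the column names stay unique once lowercased; `by_lower` and `cols` are illustrative names, and the small df here is made up for the example:

```python
import pandas as pd

df = pd.DataFrame({'Col2': [5, 8],
                   'Col17': [1, 2],
                   'Col44': [7, 0]})
sumlist = ['col2', 'col17', 'col44', 'col63']

# Map each lowercased column name to the actual column label
by_lower = {name.lower(): name for name in df.columns}

# Translate the requested names to real labels, skipping names with no match
cols = [by_lower[name.lower()] for name in sumlist if name.lower() in by_lower]

df['total'] = df[cols].sum(axis=1)
print(df)
```

Here 'col63' has no counterpart in df and is simply skipped, just as with Index.intersection.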

huangapple
  • Posted on June 5, 2023 at 13:24:34
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76403669.html