KeyError: "['cut'] not found in axis"

Question

I am working with thousands of rows of data, trying to narrow a search to certain grains. I have an 'Asset' column with about 20 different values, and I need the sum of the corresponding rows in the adjacent 'Load' column.

I would like to cut the unnecessary rows out of my data set. To start, I relabeled all of the extra assets as 'cut' (as shown in the example below) so that I could remove them all with one .drop command. Here is how it is coded:

df14['Asset'] = df14["Asset"].str.replace('BEANS', 'cut')
df14.drop("cut", axis=0)
set(df14['Asset'])

This is the error I have received:

KeyError                                  Traceback (most recent call last)
<ipython-input-593-40006512df80> in <module>
----> 1 df14.drop("cut", axis=0)
      2 set(df14['Asset'])

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   4100             level=level,
   4101             inplace=inplace,
-> 4102             errors=errors,
   4103         )
   4104 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3912         for axis, labels in axes.items():
   3913             if labels is not None:
-> 3914                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3915 
   3916         if inplace:

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _drop_axis(self, labels, axis, level, errors)
   3944                 new_axis = axis.drop(labels, level=level, errors=errors)
   3945             else:
-> 3946                 new_axis = axis.drop(labels, errors=errors)
   3947             result = self.reindex(**{axis_name: new_axis})
   3948 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py in drop(self, labels, errors)
   5338         if mask.any():
   5339             if errors != "ignore":
-> 5340                 raise KeyError("{} not found in axis".format(labels[mask]))
   5341             indexer = indexer[~mask]
   5342         return self.delete(indexer)

KeyError: "['cut'] not found in axis"

I have tried several commands to remove said lines, like:

df14.drop(["cut"], inplace=True)

df14[~df14['Asset'].isin(to_drop)]
df14[df14['Asset'].str.contains('cut', na=True)]

All of them give the same result.

When I code

df14 = df14[~df14["Asset"].str.contains('BEANS')]

it does not remove the 'Load' value, which is in the next column over, from my final calculations.

Is it possible to remove all rows of data with a certain label so I can trim from 20 assets to 7 assets?

Thank you

Answer 1

Score: 1

DataFrame.drop works on labels, either column-wise or row-wise: you pass a column name to drop a column, or an index label to drop a row. axis=0 means the labels are looked up in the index. Since none of your index labels is "cut", it raises the error.

I recommend doing it by:

df = df.loc[df['Asset'] != 'cut']
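
To make the difference concrete, here is a minimal sketch. The sample frame is hypothetical (the real df14 is not shown in the question); it only assumes an 'Asset' column, a 'Load' column, and a default integer index. It shows that drop looks up index labels rather than cell values, and two ways to remove the rows instead:

```python
import pandas as pd

# Hypothetical sample data standing in for the real df14.
df14 = pd.DataFrame({
    "Asset": ["CORN", "cut", "WHEAT", "cut"],
    "Load": [10, 20, 30, 40],
})

# drop() compares "cut" with the index labels (0, 1, 2, 3 here),
# not with the values stored in the 'Asset' column, so it fails.
try:
    df14.drop("cut", axis=0)
    raised = False
except KeyError:
    raised = True

# Option 1: keep the rows you want with a boolean mask.
kept_mask = df14.loc[df14["Asset"] != "cut"]

# Option 2: give drop() the index labels of the rows to remove.
kept_drop = df14.drop(df14.index[df14["Asset"] == "cut"])

print(kept_mask["Load"].sum())
```

Both options return a new frame and leave df14 itself unchanged, so the result has to be assigned to a name (or back to df14) before any sums are computed on it.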

Answer 2

Score: 0

I believe that df14.drop("cut", axis=0) is failing because it is looking for the label "cut" in the index of df14, not in the 'Asset' column. You could potentially set the 'Asset' column as the index (see the pandas documentation on drop for how), but I think a better solution might be something along the lines of

df14 = df14.query('Asset != "cut"')

I can't say whether this is the fastest solution; I usually work with small-ish datasets, so I haven't had to worry much about performance.
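
As a rough sketch of both routes mentioned here, assuming a hypothetical three-row frame with 'Asset' and 'Load' columns: query matches column names case-sensitively, so the expression has to spell 'Asset' exactly as it appears in the frame, and a Python variable can be referenced with the @ prefix.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df14 = pd.DataFrame({
    "Asset": ["CORN", "cut", "WHEAT"],
    "Load": [10, 20, 30],
})

# Route 1: query with the label written inside the expression.
by_query = df14.query('Asset != "cut"')

# The same filter, taking the label from a local variable via @.
label = "cut"
by_query_var = df14.query("Asset != @label")

# Route 2: make 'Asset' the index so drop() can find the label,
# then turn the index back into an ordinary column.
by_index = df14.set_index("Asset").drop("cut").reset_index()

print(by_query)
print(by_index)
```

The index route renumbers the rows after reset_index, while query keeps the original row labels; the remaining data is otherwise the same.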

Answer 3

Score: 0

This should do the job.
Here you are basically selecting all rows whose 'Asset' is anything other than 'cut'.

df14 = df14.loc[df14['Asset'] != 'cut']
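
If the end goal is to go from about 20 assets down to 7 and then sum 'Load', the relabeling step may not be needed at all. Below is a minimal sketch that assumes you can list the assets you want to keep; the sample data and the names in keep are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the asset names other than BEANS are invented.
df14 = pd.DataFrame({
    "Asset": ["CORN", "BEANS", "WHEAT", "CORN", "OATS"],
    "Load": [10, 20, 30, 40, 50],
})

# Hypothetical keep-list; in the real data this would hold the 7 wanted assets.
keep = ["CORN", "WHEAT"]

# Keep only the rows whose 'Asset' is in the list.
trimmed = df14[df14["Asset"].isin(keep)]

# Total 'Load' over the remaining rows, and the total per asset.
total_load = trimmed["Load"].sum()
load_by_asset = trimmed.groupby("Asset")["Load"].sum()

print(total_load)
print(load_by_asset)
```

The sums are taken from trimmed, not from the original df14, which is what keeps the removed rows' 'Load' values out of the totals.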

huangapple
  • Posted on January 7, 2020, 01:33:37
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59616501.html