Posted: 2023-02-19 12:07:49

pandas .drop(columns=[]) is returning KeyError when columns are in the csv and dataframe

Question

I'm trying to import market data from a csv to run some backtests.

I wrote the following code:

import pandas as pd
import numpy as np
df = pd.read_csv("30mindata.csv")
df = df.drop(columns=['Volume', 'NumberOfTrades', 'BidVolume', 'AskVolume'])
print(df)

I'm getting the error:
> KeyError: "['Volume', 'NumberOfTrades', 'BidVolume', 'AskVolume'] not found in axis"

When I remove the line of code containing drop() the dataframe prints as follows:

            Date       Time     Open     High      Low     Last   Volume   NumberOfTrades   BidVolume   AskVolume
0      2018/2/18   14:00:00  2734.50  2741.00  2734.00  2739.75     5304             2787        2299        3005
1      2018/2/18   14:30:00  2739.75  2741.00  2739.25  2740.25     1402              815         648         754
2      2018/2/18   15:00:00  2740.25  2743.50  2739.25  2742.00     4536             2301        2074        2462
3      2018/2/18   15:30:00  2742.25  2744.75  2742.25  2744.00     4102             1826        1949        2153
4      2018/2/18   16:00:00  2744.00  2744.25  2742.25  2742.25     2492             1113        1551         941
...          ...        ...      ...      ...      ...      ...      ...              ...         ...         ...
59074  2023/2/17   10:30:00  4076.25  4088.00  4076.00  4086.50    92507            54379       44917       47590
59075  2023/2/17   11:00:00  4086.50  4090.50  4079.25  4081.00   107233            67968       55784       51449
59076  2023/2/17   11:30:00  4081.00  4090.50  4079.50  4088.25   171507            92705       86022       85485
59077  2023/2/17   12:00:00  4088.00  4089.00  4085.25  4086.00    41032            17210       21176       19856
59078  2023/2/17   12:30:00  4086.25  4088.00  4085.25  4085.75     5164             2922        2818        2346

I have another file that uses this exact pattern of pd.read_csv() followed by df.drop(columns=[]), and it works just fine. I tried df.loc[:, 'Volume'] and got the same KeyError saying 'Volume' was not found in the axis. I really don't understand how the labels can be missing from the dataframe when they are printed correctly without the .drop() call.

Answer 1

Score: 1

It's very likely that your column names contain leading or trailing spaces. The printed output pads every column for alignment, so those extra spaces aren't visible there.

Try stripping those spaces like this:

import pandas as pd
df = pd.read_csv("30mindata.csv")
df.columns = [col.strip() for col in df.columns]

Then try to drop the columns as before.
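
To check whether this is really what's happening, it can help to print the column names as a list, since the quotes make stray spaces visible. Below is a minimal sketch, assuming the header in the CSV has a space after each comma. The in-memory `csv_text` is a hypothetical stand-in for 30mindata.csv (the real file's exact formatting isn't shown in the question), and `raw`, `cleaned`, and `reread` are names made up for illustration. It also shows `skipinitialspace=True` as an alternative that handles the spaces at read time.

```python
import io
import pandas as pd

# Hypothetical stand-in for 30mindata.csv: note the space after each comma.
csv_text = (
    "Date, Time, Open, High, Low, Last, Volume, NumberOfTrades, BidVolume, AskVolume\n"
    "2018/2/18, 14:00:00, 2734.50, 2741.00, 2734.00, 2739.75, 5304, 2787, 2299, 3005\n"
    "2018/2/18, 14:30:00, 2739.75, 2741.00, 2739.25, 2740.25, 1402, 815, 648, 754\n"
)

to_drop = ["Volume", "NumberOfTrades", "BidVolume", "AskVolume"]

# 1) Diagnose: printing the names as a list exposes the leading spaces,
#    which the aligned DataFrame printout hides.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# 2) Fix after loading: strip whitespace from every column name, then drop.
cleaned = raw.copy()
cleaned.columns = cleaned.columns.str.strip()
cleaned = cleaned.drop(columns=to_drop)
print(cleaned.columns.tolist())

# 3) Alternative: let read_csv skip spaces that follow the delimiter.
reread = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
reread = reread.drop(columns=to_drop)
print(reread.columns.tolist())
```

If the first print shows names like ' Volume' rather than 'Volume', whitespace is the cause. If the names look clean, the problem lies elsewhere (for example, a different delimiter), and stripping won't help.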
