Pandas dataframe - what causes this error?
Question
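The error in the code below occurs because store.select is called with a where argument while the data under 'obj2' is stored in Fixed format, and a Fixed format store does not support partial selection with where. Note that the put call in the question spells the keyword as foramt rather than format; if the installed pandas version accepts and ignores unknown keyword arguments, that would explain why the data was written in the default Fixed format. There are two ways to avoid this kind of error:
- Omit the where argument: to read the whole dataset without filtering, call store.select with the key only.
store.select('obj2')
- Store the data in Table format: to select with where, write the data in Table format instead of Fixed format by setting the format argument to 'table':
store.put('obj2', frame, format='table')
After that, a where selection works as attempted in the question's code.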
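As a minimal sketch of the difference between the two formats (not the asker's exact setup), the snippet below writes the same frame once without a format argument and once in Table format, then runs the same where query against both. The key names fixed_obj and table_obj, the temporary file, and the compare_formats helper are illustrative assumptions. PyTables is an optional pandas dependency, so the demo is skipped when it is not installed.

```python
import importlib.util
import os
import tempfile

import numpy as np
import pandas as pd

# pd.HDFStore needs the optional PyTables package ("tables").
HAVE_PYTABLES = importlib.util.find_spec("tables") is not None

QUERY = ["index >= 10 and index <= 15"]  # same where clause as in the question


def compare_formats():
    """Write one frame in both formats and try the same query on each."""
    frame = pd.DataFrame({"a": np.random.randn(100)})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.h5")  # throwaway file for the demo
        with pd.HDFStore(path) as store:
            # No format given: Fixed, unless pandas is configured otherwise.
            store.put("fixed_obj", frame)
            # Explicit Table format, which supports where queries.
            store.put("table_obj", frame, format="table")

            # get_storer() reports how each key was actually written.
            fixed_is_table = store.get_storer("fixed_obj").is_table
            table_is_table = store.get_storer("table_obj").is_table

            # The query works on the Table-format key ...
            subset = store.select("table_obj", where=QUERY)

            # ... and raises TypeError on the Fixed-format key.
            try:
                store.select("fixed_obj", where=QUERY)
                fixed_error = None
            except TypeError as exc:
                fixed_error = str(exc)
    return fixed_is_table, table_is_table, subset, fixed_error


# Skip the demo entirely when PyTables is unavailable.
result = compare_formats() if HAVE_PYTABLES else None
if result is not None:
    print(result[0], result[1], list(result[2].index), result[3])
```

Checking is_table on an existing key is a quick way to confirm how the data was written before relying on where.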
My code:
frame = pd.DataFrame({'a': np.random.randn(100)})
store = pd.HDFStore('mydata.h5')
store['obj1'] = frame
store['obj1_col'] = frame['a']
store.put('obj2',frame,foramt='table')
store.select('obj2',where=['index >= 10 and index <= 15'])
Gives this error message:
TypeError: cannot pass a where specification when reading from
a Fixed format store. this store must be selected in its entirety
Why does this code give this error if every piece of code is right? How do I avoid similar errors in the future?
Answer 1
Score: 0
(I wanted to comment, but I can't yet due to reputation...)
Hello, this is interesting -- for some reason it works on my machine. For the sake of completeness, I attach the code (with added imports).
import pandas as pd
import numpy as np
frame = pd.DataFrame({'a': np.random.randn(100)})
store = pd.HDFStore('mydata.h5')
store['obj1'] = frame
store['obj1_col'] = frame['a']
store.put('obj2', frame, format='table')
store.select('obj2', where=['index >= 10 and index <= 15'])
Returns
a
10 -0.049168
11 0.130048
12 -1.553641
13 -0.978392
14 0.723070
15 0.066814
Could you please mention the versions of the libraries you're using? I wonder whether we have different versions installed.
I have
import tables
import sys
print(sys.version) # 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
print(pd.__version__) # 1.5.0
print(np.__version__) # 1.23.3
print(tables.__version__) # 3.8.0 ... (this one is a dependency)
To clarify: I suspect it might be connected to the pytables version, as mentioned in this possibly related answer.
Could you try upgrading pytables (e.g. with pip install --upgrade tables) and running the code again?
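One more thing that may be worth checking, independent of library versions: the put call in the question spells the keyword foramt, whereas the code I ran uses format. I have not verified this for your pandas version, but if put accepts arbitrary extra keyword arguments, a misspelled name could be dropped silently and the data written in the default Fixed format. The sketch below shows that failure mode with a hypothetical stand-in function (put_like is not pandas code) and a small check_kwargs helper, also hypothetical, that rejects unknown keyword names before the call.

```python
import inspect


def put_like(key, value, format=None, append=False, **kwargs):
    """Hypothetical stand-in for an API that swallows extra keywords."""
    # Anything in kwargs is ignored, so a misspelled name has no effect.
    return format or "fixed"


def check_kwargs(func, **kwargs):
    """Return kwargs unchanged, or raise if a name is not an explicit parameter."""
    explicit = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    unknown = sorted(set(kwargs) - explicit)
    if unknown:
        raise TypeError(f"unexpected keyword(s) for {func.__name__}: {unknown}")
    return kwargs


# The typo is swallowed by **kwargs, so the default format is used.
silently_fixed = put_like("obj2", None, foramt="table")

# With the check in place, the same typo fails loudly instead.
try:
    put_like("obj2", None, **check_kwargs(put_like, foramt="table"))
    caught = None
except TypeError as exc:
    caught = str(exc)

# The correctly spelled keyword passes the check and takes effect.
checked = put_like("obj2", None, **check_kwargs(put_like, format="table"))

print(silently_fixed)
print(caught)
print(checked)
```

If this is what happened, correcting the spelling to format='table' and rewriting the key (or deleting mydata.h5 first) should make the where query work.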