Question

Not able to use the NumPy split function to allocate subsets of a DataFrame

import numpy as np
import pandas as pd

cols = ["fLength", "fWidth", "fSize", "fConc", "fConcl", "fAsym", "fM3Long", "fAlpha", "fDist", "class"]
df = pd.read_csv("magic04.data", names=cols)
df['class'] = (df['class'] == 'g').astype(int)

train, valid, test = np.split(df.sample(frac=1), [int(0.6*len(df)), int(0.8*len(df))])

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3628             try:
-> 3629                 return self._engine.get_loc(casted_key)
   3630             except KeyError as err:

17 frames
KeyError: 0

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3629                 return self._engine.get_loc(casted_key)
   3630             except KeyError as err:
-> 3631                 raise KeyError(key) from err
   3632             except TypeError:
   3633                 # If we have a listlike key, _check_indexing_error will raise

I tried reading the documentation but didn't find anything useful.

Answer 1

Score: 0

The problem in your code is that you are applying a NumPy routine to a pandas DataFrame. One way to handle this is to convert the result of df.sample() into a NumPy array first and then call np.split() on that array.

Try this (it runs fine for me in VS Code):

npsample = np.array(df.sample(frac=1))
train, valid, test = np.split(npsample, [int(0.6*len(npsample)), int(0.8*len(npsample))])
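
For anyone who wants to try the idea without the magic04.data file, here is a minimal sketch of the same approach. The small random DataFrame, the seed, and the variable names (shuffled, arr, train_df and so on) are my own stand-ins, not part of the original post. It also shows a positional iloc slice as an alternative, since converting to a NumPy array drops the column labels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for magic04.data: 10 rows, a few of the same column names.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "fLength": rng.normal(size=10),
    "fWidth": rng.normal(size=10),
    "class": rng.integers(0, 2, size=10),
})

# Shuffle the rows once, then convert to a plain NumPy array before splitting.
shuffled = df.sample(frac=1, random_state=0)
arr = shuffled.to_numpy()
n = len(arr)

# 60% / 20% / 20% split at row positions 6 and 8.
train, valid, test = np.split(arr, [int(0.6 * n), int(0.8 * n)])

# Alternative that keeps DataFrames (and column names): slice by position.
train_df = shuffled.iloc[: int(0.6 * n)]
valid_df = shuffled.iloc[int(0.6 * n): int(0.8 * n)]
test_df = shuffled.iloc[int(0.8 * n):]

print(train.shape, valid.shape, test.shape)
print(list(train_df.columns))
```

The array version is closest to what the answer suggests. The iloc version may be more convenient if later code still needs to refer to columns such as "class" by name.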
