July 10, 2023, 16:30:07

Polars - ComputeError: cannot cast 'Object' type after conversion from Numpy Array

Question

I have a Polars DataFrame that I split into multiple frames using np.array_split. After the split and the conversion back to a Polars DataFrame, all columns have the data type 'object'. When I try to change the data type with cast(), I get the following error:

ComputeError: cannot cast 'Object' type

What am I doing wrong, and how can I fix this? I need the columns to have different data types for further processing.

import numpy as np
import polars as pl

df = pl.DataFrame({
    'column1': ['2021-01-01', '2021-02-02', '2021-03-03'],
    'column2': ['value1', 'value2', 'value3']
})

# np.array_split returns a NumPy object array, so every column comes back as Object
df = pl.from_numpy(np.array_split(df, 2)[0], schema=df.columns, orient='row')
# Raises ComputeError: cannot cast 'Object' type
df = df.with_columns(pl.col('column1').cast(pl.Utf8))

Answer 1

Score: 2

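pandas appears to do something that makes np.array_split() return a DataFrame. (The outputs shown here have three rows, which matches the five-row frame defined further down rather than the three-row frame in the question.)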

>>> np.array_split(df.to_pandas(), 2)[0]
      column1 column2
0  2021-01-01  value1
1  2021-02-02  value2
2  2021-03-03  value3

Polars doesn't do this; the same call returns a NumPy array with dtype=object:

>>> np.array_split(df, 2)[0]
array([['2021-01-01', 'value1'],
       ['2021-02-02', 'value2'],
       ['2021-03-03', 'value3']], dtype=object)

Instead of np.array_split you could use the row count and modulo (%) to create groups:

import polars as pl

df = pl.DataFrame({
    'column1': ['2021-01-01', '2021-02-02', '2021-03-03', '2021-04-04', '2021-05-05'],
    'column2': ['value1', 'value2', 'value3', 'value4', 'value5']
})

# Number rows from 1; each odd row number starts a new group, so rows pair up as 1, 1, 2, 2, 3.
# Newer Polars releases rename with_row_count() to with_row_index() and cumsum() to cum_sum().
(df.with_row_count(offset=1)
   .with_columns(group = (pl.col('row_nr') % 2 != 0).cumsum())
)

shape: (5, 4)
┌────────┬────────────┬─────────┬───────┐
│ row_nr ┆ column1    ┆ column2 ┆ group │
│ ---    ┆ ---        ┆ ---     ┆ ---   │
│ u32    ┆ str        ┆ str     ┆ u32   │
╞════════╪════════════╪═════════╪═══════╡
│ 1      ┆ 2021-01-01 ┆ value1  ┆ 1     │
│ 2      ┆ 2021-02-02 ┆ value2  ┆ 1     │
│ 3      ┆ 2021-03-03 ┆ value3  ┆ 2     │
│ 4      ┆ 2021-04-04 ┆ value4  ┆ 2     │
│ 5      ┆ 2021-05-05 ┆ value5  ┆ 3     │
└────────┴────────────┴─────────┴───────┘

Depending on the goal, you could then use .groupby() (renamed .group_by() in newer Polars releases) or .partition_by() to split the dataframe.
