DataFrame conversion from pandas to Polars: difference in the final dimensions

Question

I'm trying to convert a pandas DataFrame to a Polars one.

I simply called result_polars = pl.from_pandas(result).

The conversion runs without errors, but when I check the shapes of the two DataFrames, the Polars one is half the size of the original pandas DataFrame.

I believe that a length of 4172903059 is close to the maximum that a Polars DataFrame allows.

Does anyone have suggestions?

Here is a screenshot of the shapes of the two DataFrames.

Here is a minimal working example:

import polars as pl
import pandas as pd
import numpy as np

df = pd.DataFrame(np.zeros((4292903069,1), dtype=np.uint8))
df_polars = pl.from_pandas(df)

With these dimensions the two DataFrames have the same size. If I instead use the following:

import polars as pl
import pandas as pd
import numpy as np

df = pd.DataFrame(np.zeros((4392903069,1), dtype=np.uint8))
df_polars = pl.from_pandas(df)

The Polars DataFrame has a much smaller length (97935773).

Answer 1

Score: 4

The default Polars wheel installed with pip install polars "only" allows for 2^32, i.e. ~4.3 billion, rows.

If you need more than that, uninstall the previous installation and install the 64-bit index build with pip install polars-u64-idx.
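As a quick sanity check, the numbers in the question line up with that limit. The sketch below uses only plain Python arithmetic, so it needs neither pandas nor Polars. The helper name wrapped_row_count is hypothetical and made up for illustration, and the idea that the row count wraps around modulo 2^32 is an inference from the reported numbers, not something confirmed from the Polars source.

```python
# Number of distinct values in an unsigned 32-bit row index.
U32_ROWS = 2**32  # 4_294_967_296


def wrapped_row_count(n_rows: int, modulus: int = U32_ROWS) -> int:
    """Hypothetical helper: the length a 32-bit counter would report for n_rows rows."""
    return n_rows % modulus


first_example = 4_292_903_069   # row count from the first example in the question
second_example = 4_392_903_069  # row count from the second example in the question

print(first_example < U32_ROWS)           # True: fits below 2^32
print(second_example < U32_ROWS)          # False: exceeds 2^32
print(wrapped_row_count(second_example))  # 97935773, the length reported in the question
```

The first example stays just below 2^32, which would explain why both DataFrames match there, while the second exceeds it by exactly 97935773 rows. A comparison like n_rows < 2**32 before calling pl.from_pandas is a cheap way to notice when the default wheel is not enough.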

huangapple
  • Posted on February 8, 2023, 18:31:06
  • Source link: https://go.coder-hub.com/75384451.html