Dataframe conversion from pandas to polars -- difference in the final dimensions
Question
I'm trying to convert a pandas DataFrame to a Polars one.
I simply used result_polars = pl.from_pandas(result).
The conversion runs without errors, but when I check the shapes of the two dataframes, the Polars one has half the size of the original pandas DataFrame.
I believe that a length of 4172903059 is close to the maximum that a Polars dataframe allows.
Does anyone have suggestions?
[Screenshot: shapes of the two dataframes]
Here is a minimal working example:
import polars as pl
import pandas as pd
import numpy as np
df = pd.DataFrame(np.zeros((4292903069,1), dtype=np.uint8))
df_polars = pl.from_pandas(df)
With these dimensions, the two dataframes have the same size. If I instead use the following:
import polars as pl
import pandas as pd
import numpy as np
df = pd.DataFrame(np.zeros((4392903069,1), dtype=np.uint8))
df_polars = pl.from_pandas(df)
the Polars dataframe has a much smaller length (97935773).
Answer 1
Score: 4
The default Polars wheel installed with pip install polars "only" allows 2^32 rows, i.e. about 4.29 billion.
If you need more than that, uninstall the previous installation and install the 64-bit index build with pip install polars-u64-idx.
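The length reported in the question is consistent with a row count wrapping around a 32-bit index. The sketch below only reproduces that arithmetic in plain Python; it assumes the count is reduced modulo 2^32, which is an inference from the numbers rather than something the answer states, and wrapped_length is a hypothetical helper, not a Polars function.

```python
# Assumption: the default Polars build keeps row counts in an unsigned
# 32-bit index, so a larger count is reduced modulo 2**32.
U32_LIMIT = 2**32  # 4294967296


def wrapped_length(n_rows: int, limit: int = U32_LIMIT) -> int:
    """Hypothetical helper: the length seen if n_rows is truncated to 32 bits."""
    return n_rows % limit


fits = wrapped_length(4292903069)       # below 2**32, so unchanged
overflows = wrapped_length(4392903069)  # above 2**32, so it wraps around

print(fits)       # 4292903069
print(overflows)  # 97935773, the length reported in the question
```

To see which index width an installed build uses, pl.get_index_type() should report UInt32 for the default wheel and UInt64 for polars-u64-idx.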