June 4, 2023, 22:51:32


Polars memory usage as compared to {data.table}

Question


Fairly new to python-polars.

How does it compare to R's {data.table} package in terms of memory usage?

How does it handle shallow copying?

Is in-place/by reference updating possible/the default?

Are there any recent benchmarks on memory efficiency of the big 4 in-mem data wrangling libs (polars vs data.table vs pandas vs dplyr)?

Answer 1

Score: 3


> How does it handle shallow copying?

Polars memory buffers are reference-counted and copy-on-write. That means cloning or shallow-copying a DataFrame within Polars never triggers a full copy of the data.

> Is in-place/by reference updating possible/the default?

No, you must reassign the variable. Under the hood, Polars may reuse memory buffers, but that is not visible to the user.

> Are there any recent benchmarks on memory efficiency?

Asking how the libraries compare on memory usage alone also overlooks their design differences. Polars is currently developing an out-of-core engine. This engine does not process all data in memory, but streams data from disk. Its design philosophy is to use as much memory as needed without going OOM (out of memory). Unused memory is wasted potential.
