What is the time complexity of the pandas groupby function?
Question
I tried to find this, but couldn't find it anywhere. An article I read says that groupby works by splitting and binning the items, but I couldn't work out the time complexity with any confidence.
https://www.geeksforgeeks.org/pandas-groupby/
I also looked at groupby's implementation, but unfortunately I couldn't make sense of it.
Answer 1
Score: 2
Splitting the rows into groups is O(n), where n is the number of rows. Since groupby sorts the group keys by default (the sort parameter), with k the number of unique groups, the overall complexity is O(n + k*log(k)), which is why the documentation recommends "Get better performance by turning this off".
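To make the two terms concrete, here is a minimal pure-Python sketch of a hash-based grouping. It is not pandas' actual implementation; `group_rows` and its `sort` flag are hypothetical names chosen for illustration, and the O(n) split assumes average-case constant-time dictionary inserts.

```python
from collections import defaultdict


def group_rows(rows, key, sort=True):
    """Group rows by key(row). Hypothetical helper, not pandas internals."""
    groups = defaultdict(list)
    # Split step: one pass over the n rows, one average O(1) dict insert each.
    for row in rows:
        groups[key(row)].append(row)
    # Optional sort step: ordering the k distinct keys costs O(k log k).
    # With sort=False the keys stay in first-seen order and this cost is skipped.
    keys = sorted(groups) if sort else list(groups)
    return [(k, groups[k]) for k in keys]


rows = [("b", 1), ("a", 2), ("b", 3), ("c", 4), ("a", 5)]
sorted_groups = group_rows(rows, key=lambda r: r[0])
unsorted_groups = group_rows(rows, key=lambda r: r[0], sort=False)

print([k for k, _ in sorted_groups])    # ['a', 'b', 'c']
print([k for k, _ in unsorted_groups])  # ['b', 'a', 'c']
```

In pandas itself the corresponding switch is the `sort` argument, for example `df.groupby("key", sort=False)`. When k is much smaller than n, the sort term is small next to the O(n) pass, so the gain from disabling it is likely to matter mostly when there are many distinct groups.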