How to pass arguments to an aggregation function in Dask

Question

I just discovered today that the sum of a column full of NaNs is 0 in pandas and Dask (why?!). I need the sum of an all-NaN column to be NaN, because NaNs mean those values are missing, so their sum should be NaN as well.

From the documentation, it appears that you have to pass min_count=1 to get this behaviour. However, I'm doing the sum inside an aggregation that looks like this:

ddf.groupby("code").aggregate({'rain': 'sum'}).compute()

Adding the min_count argument to the aggregate call seems to have no effect, while using a lambda in place of 'sum' raises an error.
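For reference, the behaviour described in the question can be reproduced in plain pandas. This is a minimal sketch with made-up data (the `code` and `rain` values are illustrative assumptions): `sum` skips NaNs and its `min_count` parameter defaults to 0, so an all-NaN series sums to 0, while `min_count=1` requires at least one valid value and otherwise returns NaN.

```python
import numpy as np
import pandas as pd

# A series that contains only missing values (illustrative data).
s = pd.Series([np.nan, np.nan])

default_total = s.sum()            # min_count defaults to 0, so this is 0.0
strict_total = s.sum(min_count=1)  # no valid values, so this is NaN

# The same parameter is accepted by a pandas groupby sum.
df = pd.DataFrame({
    "code": ["a", "a", "b", "b"],
    "rain": [1.0, np.nan, np.nan, np.nan],
})
grouped = df.groupby("code")["rain"].sum(min_count=1)

print(default_total, strict_total)
print(grouped)
```

Group `a` has one valid value and keeps its sum, while group `b` has none and comes out as NaN.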

Answer 1

Score: 1

Finally found out how to do this with custom aggregations.

import dask.dataframe as dd

# Define our own sum that returns NaN for groups with no valid values.
# The first function (chunk) runs on the grouped series of each partition;
# the second (agg) combines the per-partition partial sums.
custom_sum = dd.Aggregation('custom_sum',
                            lambda s: s.sum(min_count=1),
                            lambda s0: s0.sum(min_count=1))

ddf.groupby("code").aggregate({'rain': custom_sum}).compute()
