pandas Series to_json memory leak

Question

My production service's memory usage keeps increasing, and I think the root cause is pandas.Series.to_json.

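import gc
import pandas as pd

for _ in range(10):
    series = pd.Series([0.008, 0.002])
    json_string = series.to_json(orient="records")
    # Force a full collection so that only surviving objects are counted
    gc.collect()
    # Number of objects currently tracked by the garbage collector
    print("gc_count={}".format(len(gc.get_objects())))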

output:

gc_count=46619
gc_count=46619
gc_count=46620
gc_count=46621
gc_count=46622
gc_count=46623
gc_count=46624
gc_count=46625
gc_count=46626
gc_count=46627

Interestingly, the first and second calls always have the same GC object count, and then the count increases by one in each iteration.

Has anyone faced this before? Are there ways to avoid the memory leak?

[Python versions tried: 3.8 and 3.9]

Update: This seems to be related to https://github.com/pandas-dev/pandas/issues/24889. Calling to_dict and serializing the result with the json module seems to be a workaround.
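The to_dict plus json workaround from the update could look roughly like the sketch below. The helper name series_to_json_records is hypothetical (it is not part of pandas), and the sketch assumes the Series holds values the standard json module can serialize, such as floats or strings; values such as timestamps would need converting first.

```python
import json

import pandas as pd


def series_to_json_records(series):
    """Serialize the values of a Series to a JSON array without calling to_json.

    Hypothetical helper for illustration only.
    """
    # to_dict() maps index labels to values; only the values are kept here
    values = list(series.to_dict().values())
    # Compact separators avoid the space json.dumps puts after each comma
    return json.dumps(values, separators=(",", ":"))


json_string = series_to_json_records(pd.Series([0.008, 0.002]))
print(json_string)
```

Because the serialization happens in the json module, this path does not go through the to_json code discussed in the linked issue.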

Answer 1

Score: 1

The bug seems to be fixed in the latest version of pandas. It is present in pandas 1.1.3, where it can be reproduced consistently.
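To check whether a given installation is affected, one option is to measure how many garbage-collector-tracked objects survive repeated calls, using the same approach as the question. This is a minimal sketch: object_growth and call_to_json are hypothetical names, and the warm-up and iteration counts are arbitrary assumptions.

```python
import gc

import pandas as pd


def object_growth(func, iterations=20, warmup=3):
    """Return the change in the number of gc-tracked objects across repeated calls."""
    # Warm-up calls let one-time caches and lazy imports settle first
    for _ in range(warmup):
        func()
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(iterations):
        func()
    gc.collect()
    after = len(gc.get_objects())
    return after - before


def call_to_json():
    # Same call as in the question's reproduction
    series = pd.Series([0.008, 0.002])
    return series.to_json(orient="records")


growth = object_growth(call_to_json)
print("pandas {}: object growth = {}".format(pd.__version__, growth))
```

A growth value that scales with the number of iterations would be consistent with the leak described in the question, while a value near zero suggests the installed version is not affected.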

Possible solutions

  1. Upgrade to the latest version of pandas.
  2. If you have to use an older version of pandas, you can use a workaround like the following:

Instead of

series.to_json(orient="records")

use

str(list(series.to_dict().values()))
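Note that str() produces a Python representation, which happens to match JSON for a list of ordinary floats but not in general. The sketch below, using a hypothetical Series of strings, shows the difference and why json.dumps may be the safer choice when the consumer expects strict JSON.

```python
import json

import pandas as pd

labels = pd.Series(["a", "b"])  # hypothetical data for illustration
values = list(labels.to_dict().values())

as_str = str(values)          # Python repr: strings are single-quoted
as_json = json.dumps(values)  # JSON: strings are double-quoted

print(as_str)
print(as_json)

# The str() output is rejected by a strict JSON parser
try:
    json.loads(as_str)
    str_output_is_valid_json = True
except json.JSONDecodeError:
    str_output_is_valid_json = False
print(str_output_is_valid_json)
```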

huangapple
  • Posted on February 24, 2023 at 06:38:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75551043.html