Java runs out of memory. Is this not a memory leak?

Question

I have assigned our Java program 2GB of memory. During the hours a particular thread is running, memory steadily and linearly increases until Kubernetes kills it because it reaches the 2GB limit I assigned. Of course we were thinking of a memory leak, but we see something like this all the time in the gc log:

[7406.381s][info][gc] GC(8326) Pause Full (System.gc()) 130M->65M(214M) 157.995ms
  1. Since the memory increases linearly while these logs indicate that the heap memory does not increase, is it useless to investigate memory leaks?
  2. What could be other likely causes of the increasing memory?
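
For reference, the GC log line above only describes the Java heap, while the memory limit applies to the whole JVM process (heap plus Metaspace, thread stacks, direct buffers and other native memory). A minimal sketch of how both numbers can be logged from inside the JVM using the standard java.lang.management API (the class name and output format are illustrative only, not part of the original program):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReport {
    public static void main(String[] args) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = bean.getHeapMemoryUsage();        // what the GC log describes
        MemoryUsage nonHeap = bean.getNonHeapMemoryUsage();  // Metaspace, code cache, ...
        System.out.printf("heap used:     %d MB%n", heap.getUsed() / (1024 * 1024));
        System.out.printf("non-heap used: %d MB%n", nonHeap.getUsed() / (1024 * 1024));
        // Thread stacks and direct ByteBuffers appear in neither number,
        // but they still count against the container's memory limit.
    }
}

If the heap stays flat (as the quoted GC line suggests) while total process memory climbs, the growth is happening outside the heap.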

Some background info:

There are no logs that say the container was stopped or killed. There are also no events in k8s (however "restarts" = 1). The above log line was the last log line before we see (in Graylog) that Spring Boot / Tomcat is starting (hence it must have been restarted). We see this happening exactly at the time when the memory graph reaches the 2GB line in Grafana. Without Grafana it would have taken a while before we figured out it was something related to memory.
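
When a container is killed for exceeding its memory limit, Kubernetes typically records this in the container's last state rather than as a pod event, which would explain why nothing shows up in the event list. A sketch of how to check, with <pod-name> as a placeholder:

kubectl describe pod <pod-name>
# look for: Last State: Terminated / Reason: OOMKilled

kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState}'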

Relevant part of the Kubernetes deployment YAML:

spec:
  template:
    spec:
      containers:
        - name: ... (omitted)
          resources:
            limits:
              cpu: 1200m
              memory: 2Gi
            requests:
              cpu: 50m
              memory: 50Mi

Last line of Dockerfile:

ENTRYPOINT ["java", "-Xmx2G", "-verbose:gc", "-jar", "/backend.jar"]

where "-verbose:gc" causes the log lines like the line I quoted above.

It takes a while to reproduce the problem, but we did that a couple of times.

We're using Java 11.

Answer 1

Score: 4

I don't think you have a leak at all; you are just using the options wrong. With -Xmx2G you are telling Java that it can use up to 2G for the heap. At the same time you are telling Kubernetes that the absolute limit for memory is 2Gi. Now, Java uses memory that is not on the heap, so when it tries to expand the heap to 2G it runs out and the pod is killed.

To fix the problem, make sure that you allow a reasonable margin for the memory that is outside the heap. Increase the Kubernetes limit to 3G temporarily and then scale it down once you know how much native memory you need. I would guess that 2.5G is a reasonable level, but that is just a guess. Alternatively, you can decrease the Java heap size and run with a 1.5G heap (or less) to leave some room for the native memory.
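
A sketch of how either suggestion could look; the numbers are only the guesses from the answer above and would need tuning against real measurements. Raise the limit in the deployment YAML while keeping -Xmx2G:

resources:
  limits:
    cpu: 1200m
    memory: 3Gi

Or keep the 2Gi limit and shrink the heap in the Dockerfile instead:

ENTRYPOINT ["java", "-Xmx1536m", "-verbose:gc", "-jar", "/backend.jar"]

On Java 11 the heap can also be sized relative to the container limit rather than with a fixed -Xmx, e.g. -XX:MaxRAMPercentage=75.0, which with a 2Gi limit yields a maximum heap of roughly 1.5G.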

huangapple
  • Posted on March 16, 2020 at 17:55:56
  • When reposting, please keep the link to this article: https://go.coder-hub.com/60703763.html