Java runs out of memory. Is this not a memory leak?

Question

I have assigned our Java program 2GB of memory. During the hours a particular thread is running, memory steadily and linearly increases until Kubernetes kills it because it reaches the 2GB limit I assigned. Of course we were thinking of a memory leak, but we see something like this all the time in the gc log:

[7406.381s][info][gc] GC(8326) Pause Full (System.gc()) 130M->65M(214M) 157.995ms

  1. Since the memory increases linearly while these logs indicate that the heap memory does not increase, is it useless to investigate memory leaks?
  2. What could be other likely causes of the increasing memory?

Some background info:

There are no logs that say the container was stopped or killed. There are also no events in k8s (although "restarts" = 1). The log line above was the last one before we see (in Graylog) that Spring Boot / Tomcat is starting, so the container must have been restarted. This happens exactly when the memory graph reaches the 2GB line in Grafana. Without Grafana it would have taken a while to figure out that the problem was memory-related.
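
For reference, a pod that is killed for exceeding its memory limit usually shows the reason in its status rather than in the application logs. A quick way to check (the pod name is a placeholder):

kubectl describe pod <backend-pod-name>

Under "Last State" the previous container instance should report "Reason: OOMKilled" if the 2Gi limit was hit.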

The relevant part of the Kubernetes deployment YAML:

spec:
  template:
    spec:
      containers:
        - name: ... (omitted)
          resources:
            limits:
              cpu: 1200m
              memory: 2Gi
            requests:
              cpu: 50m
              memory: 50Mi

Last line of Dockerfile:

ENTRYPOINT ["java", "-Xmx2G", "-verbose:gc", "-jar", "/backend.jar"]

where "-verbose:gc" causes the log lines like the line I quoted above.

It takes a while to reproduce the problem, but we did that a couple of times.

We're using Java 11.

Answer 1

Score: 4

I don't think you have a leak at all; you are just using the options wrong. With -Xmx2G you are telling Java that it can use up to 2G for the heap. At the same time you are telling Kubernetes that the absolute limit for the container's memory is 2Gi. Now, Java also uses memory that is not on the heap (for example metaspace, thread stacks, and the code cache), so when it tries to expand the heap towards 2G the total memory use exceeds the limit and the pod is killed.
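
One way to see where that non-heap memory goes (a sketch, not something from the question) is the JDK's Native Memory Tracking. It adds a small overhead and requires a JDK in the image, so it is more of a diagnostic than a permanent setting:

# start the JVM with native memory tracking enabled
ENTRYPOINT ["java", "-Xmx2G", "-XX:NativeMemoryTracking=summary", "-verbose:gc", "-jar", "/backend.jar"]

# then, from a shell inside the running container (java is PID 1 in the ENTRYPOINT exec form)
jcmd 1 VM.native_memory summary

The summary breaks committed memory down into heap, metaspace, thread stacks, code cache and so on, which shows how much headroom the container limit needs on top of -Xmx.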

To fix the problem, make sure that you allow a reasonable margin for the memory that is outside the heap. Increase the Kubernetes limit to 3G temporarily and then scale it down once you know how much native memory you need. I would guess that 2.5G is a reasonable level, but that is just a guess. Alternatively you can decrease the Java heap size and run with a 1.5G heap (or less) to leave some room for the native memory.
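
As an illustration of the second option (1536m is just the 1.5G guess from above, not a measured value), the heap cap could be lowered while the Kubernetes limit stays at 2Gi:

ENTRYPOINT ["java", "-Xmx1536m", "-verbose:gc", "-jar", "/backend.jar"]

On Java 11 the heap can also be sized relative to the container limit with -XX:MaxRAMPercentage (for example -XX:MaxRAMPercentage=75.0 instead of -Xmx), so the heap automatically leaves a margin below whatever memory limit Kubernetes enforces.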
