How to log Python code memory consumption?



Question

Hi, I am running a Docker container with a Python application inside. The code performs some computing tasks, and I would like to monitor its memory consumption using logs (so I can see how different parts of the calculations perform). I do not need any charts or continuous monitoring - I am okay with the inaccuracy of this approach.

How should I do it without losing performance?

Using external (AWS) tools to monitor used memory is not suitable, because I often debug using logs, so it is very difficult to match logs with performance charts. Also, the resolution is too low.

Setup

  • using python:3.10 as base docker image
  • using Python 3.10
  • running in AWS ECS Fargate (but results are similar while testing on local)
  • running the calculation method using asyncio

I have read some articles about tracemalloc, but according to them it degrades performance a lot (around 30 %).

Tried methods

I have tried the following method; however, it shows the same memory usage every time it is called, so I doubt it works the desired way.

Using resource

    import asyncio
    import resource
    # Local imports
    from utils import logger

    def get_usage():
        # ru_maxrss is the *peak* RSS; on Linux it is reported in kilobytes,
        # so divide by 1000 for MB (dividing by 1000/1000 would yield GB)
        usage = round(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000, 4)
        logger.info(f"Current memory usage is: {usage} MB")
        return usage

    # Do calculation - EXAMPLE
    asyncio.run(
        some_method_to_do_calculations()
    )

Logs from CloudWatch (screenshot omitted)

Using psutil (in testing)

    import psutil
    # Local imports
    from utils import logger

    def get_usage():
        # NOTE: virtual_memory() reports machine-wide memory for the host,
        # not this process's own usage
        vm = psutil.virtual_memory()
        total = round(vm.total / 1000 / 1000, 4)
        used = round(vm.used / 1000 / 1000, 4)
        pct = round(used / total * 100, 1)
        logger.info(f"Current memory usage is: {used} / {total} MB ({pct} %)")
        return True
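Since virtual_memory() describes the whole machine, a per-process reading may track the calculations more closely; psutil exposes that as well. A minimal sketch:

```python
import os

import psutil

def process_usage_mb():
    # memory_info().rss is the resident set size of this process only,
    # unlike psutil.virtual_memory(), which describes the whole machine
    rss = psutil.Process(os.getpid()).memory_info().rss
    return round(rss / 1000 / 1000, 1)

print(f"Current process memory usage is: {process_usage_mb()} MB")
```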

Answer 1

Score: 2


It seems like using psutil fits my needs pretty well. Thanks to all commenters! 🙏

Example

    import psutil
    # Local imports
    from utils import logger

    def get_usage():
        vm = psutil.virtual_memory()
        total = round(vm.total / 1000 / 1000, 4)
        used = round(vm.used / 1000 / 1000, 4)
        pct = round(used / total * 100, 1)
        logger.info(f"Current memory usage is: {used} / {total} MB ({pct} %)")
        return True
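To match readings to specific parts of the calculation, one option (a hypothetical helper, not part of the answer) is a small context manager that logs memory before and after each step:

```python
import logging
from contextlib import contextmanager

import psutil

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@contextmanager
def log_memory(step):
    # Hypothetical helper: log host memory usage around a named step,
    # so the step name shows up next to the reading in the logs
    logger.info("[%s] start: %.1f MB used", step, psutil.virtual_memory().used / 1e6)
    try:
        yield
    finally:
        logger.info("[%s] end: %.1f MB used", step, psutil.virtual_memory().used / 1e6)

with log_memory("example-step"):
    data = [i * i for i in range(100_000)]
```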

Answer 2

Score: 1


Fargate uses cgroups for memory limiting.

As mentioned here and here, the CPU/memory values provided by /proc refer to the host, not the container.

As a result, userspace tools such as top and free report misleading values.

You can try with something like:

    # Read the container's limit and usage from the cgroup v1 accounting file
    with open('/sys/fs/cgroup/memory/memory.stat', 'r') as f:
        for line in f:
            if 'hierarchical_memory_limit ' in line:
                memory_limit = int(line.split()[1])
            if 'total_rss ' in line:
                memory_usage = int(line.split()[1])

    percentage = memory_usage * 100 / memory_limit
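Note that the memory.stat path above exists only under cgroup v1. On hosts using the unified cgroup v2 hierarchy, usage and limit live in memory.current and memory.max instead (the base path below is an assumption and may differ per environment):

```python
from pathlib import Path

def parse_cgroup_v2(usage_text, limit_text):
    # memory.max contains the literal string "max" when no limit is set
    usage = int(usage_text.strip())
    limit_raw = limit_text.strip()
    limit = None if limit_raw == "max" else int(limit_raw)
    return usage, limit

def read_cgroup_v2(base="/sys/fs/cgroup"):
    # Assumed mount point for the unified hierarchy
    p = Path(base)
    return parse_cgroup_v2((p / "memory.current").read_text(),
                           (p / "memory.max").read_text())
```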

huangapple
  • Posted on 2023-03-07 22:41:02
  • Please keep the original link when reposting: https://go.coder-hub.com/75663384.html