How to log Python code memory consumption?



Question

Hi, I am running a Docker container with a Python application inside. The code performs some computing tasks, and I would like to monitor its memory consumption using logs (so I can see how different parts of the calculations perform). I do not need any charts or continuous monitoring - I am okay with the inaccuracy of this approach.

How should I do it without losing performance?

Using external (AWS) tools to monitor used memory is not suitable, because I often debug using logs, so it is very difficult to match logs with performance charts. Also, the resolution is too low.

Setup

  • using python:3.10 as base docker image
  • using Python 3.10
  • running in AWS ECS Fargate (but results are similar while testing on local)
  • running the calculation method using asyncio

I have read some articles about tracemalloc, but according to them it degrades performance a lot (around 30 %).

Tried methods

I have tried the following method; however, it shows the same memory usage every time it is called, so I doubt it works the desired way.

Using resource

    import asyncio
    import resource
    # Local imports
    from utils import logger

    def get_usage():
        # ru_maxrss is the *peak* RSS; on Linux it is reported in kilobytes,
        # so divide by 1000 for MB (dividing by 1000/1000 would yield GB)
        usage = round(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1000, 4)
        logger.info(f"Current memory usage is: {usage} MB")
        return usage

    # Do calculation - EXAMPLE
    asyncio.run(
        some_method_to_do_calculations()
    )

Logs from CloudWatch (screenshot omitted)

Using psutil (in testing)

    import psutil
    # Local imports
    from utils import logger

    def get_usage():
        # NOTE: virtual_memory() reports machine-wide memory for the host,
        # not this process's own usage
        vm = psutil.virtual_memory()
        total = round(vm.total / 1000 / 1000, 4)
        used = round(vm.used / 1000 / 1000, 4)
        pct = round(used / total * 100, 1)
        logger.info(f"Current memory usage is: {used} / {total} MB ({pct} %)")
        return True
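Since virtual_memory() describes the whole machine, a per-process reading may track the calculations more closely; psutil exposes that as well. A minimal sketch:

```python
import os

import psutil

def process_usage_mb():
    # memory_info().rss is the resident set size of this process only,
    # unlike psutil.virtual_memory(), which describes the whole machine
    rss = psutil.Process(os.getpid()).memory_info().rss
    return round(rss / 1000 / 1000, 1)

print(f"Current process memory usage is: {process_usage_mb()} MB")
```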

Answer 1

Score: 2


It seems like using psutil fits my needs pretty well. Thanks to all commenters! 🙏

Example

    import psutil
    # Local imports
    from utils import logger

    def get_usage():
        vm = psutil.virtual_memory()
        total = round(vm.total / 1000 / 1000, 4)
        used = round(vm.used / 1000 / 1000, 4)
        pct = round(used / total * 100, 1)
        logger.info(f"Current memory usage is: {used} / {total} MB ({pct} %)")
        return True
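To match readings to specific parts of the calculation, one option (a hypothetical helper, not part of the answer) is a small context manager that logs memory before and after each step:

```python
import logging
from contextlib import contextmanager

import psutil

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@contextmanager
def log_memory(step):
    # Hypothetical helper: log host memory usage around a named step,
    # so the step name shows up next to the reading in the logs
    logger.info("[%s] start: %.1f MB used", step, psutil.virtual_memory().used / 1e6)
    try:
        yield
    finally:
        logger.info("[%s] end: %.1f MB used", step, psutil.virtual_memory().used / 1e6)

with log_memory("example-step"):
    data = [i * i for i in range(100_000)]
```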

Answer 2

Score: 1


Fargate uses cgroups for memory limiting.

As mentioned here and here, the CPU/memory values provided by /proc refer to the host, not the container.

As a result, userspace tools such as top and free report misleading values.

You can try with something like:

    # Read the container's limit and usage from the cgroup v1 accounting file
    with open('/sys/fs/cgroup/memory/memory.stat', 'r') as f:
        for line in f:
            if 'hierarchical_memory_limit ' in line:
                memory_limit = int(line.split()[1])
            if 'total_rss ' in line:
                memory_usage = int(line.split()[1])

    percentage = memory_usage * 100 / memory_limit
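Note that the memory.stat path above exists only under cgroup v1. On hosts using the unified cgroup v2 hierarchy, usage and limit live in memory.current and memory.max instead (the base path below is an assumption and may differ per environment):

```python
from pathlib import Path

def parse_cgroup_v2(usage_text, limit_text):
    # memory.max contains the literal string "max" when no limit is set
    usage = int(usage_text.strip())
    limit_raw = limit_text.strip()
    limit = None if limit_raw == "max" else int(limit_raw)
    return usage, limit

def read_cgroup_v2(base="/sys/fs/cgroup"):
    # Assumed mount point for the unified hierarchy
    p = Path(base)
    return parse_cgroup_v2((p / "memory.current").read_text(),
                           (p / "memory.max").read_text())
```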

huangapple
  • Posted on 2023-03-07 22:41:02
  • Please keep the original link when reposting: https://go.coder-hub.com/75663384.html