How to reduce the memory usage during Python program execution

Question
I have some questions about memory usage in my program. When I run a file-reading program, the file size is only 100MB, but the memory usage of my process shows as 1.6GB (I haven't set any additional variables and have only imported the necessary library files). I understand that there are many other things that are also stored in memory during program execution, but I would like to know if there is any way to reduce them. The same thing happens when I transfer variables to the GPU, and my GPU shows usage of 600MB.
Code
import numpy as np
import torch
import time
import struct

if __name__ == "__main__":
    graphEdge = []
    boundList = []
    file_path = "./../../srcList.bin"
    with open(file_path, 'rb') as file:
        while True:
            data = file.read(4)
            if not data:
                break
            integer = struct.unpack('i', data)[0]
            graphEdge.append(integer)
    file_path = "./../../range.bin"
    with open(file_path, 'rb') as file:
        while True:
            data = file.read(4)
            if not data:
                break
            integer = struct.unpack('i', data)[0]
            boundList.append(integer)
    graphEdge = torch.Tensor(graphEdge).to(torch.int).to('cuda:0')
    boundList = torch.Tensor(boundList).to(torch.int).to('cuda:0')
    memory_size = graphEdge.element_size() * graphEdge.numel()
    print(f"Tensor memory: {memory_size/(1024*1024)} MB")
Answer 1 (score: 3)
A 32-bit number as a Python int object takes 32 bytes. Plus 8 bytes for the list's reference to it. So 4 bytes in the file take 40 bytes in memory as Python list+ints.

You commented that graphEdge has 29,856,864 numbers, but only 589,563 different ones. So you have many duplicates, and you can save a lot of memory by not storing separate int objects with the same value. Use only one object per different value, and use that object repeatedly. One way to do that is with a dictionary that maps each value to its object. Do intern = {}.setdefault at the start, and then append like this:

graphEdge.append(intern(integer, integer))

The dictionary of course also takes extra memory, but it saves a lot more than it costs. I estimate it'll overall take 800 MB less.

See interning at Wikipedia.
You could save even more if you didn't create a list of int objects but a Python or NumPy array of 32-bit ints, but I'm not familiar with Torch and what it accepts as input.
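One way to realize that array idea is the stdlib array module, which stores values as raw machine ints rather than Python objects and can ingest the binary data in bulk (toy bytes standing in for the file contents):

```python
# Sketch: array.array('i') holds raw 4-byte ints -- roughly 10x smaller than
# a list of Python int objects -- and frombytes/fromfile read data in bulk.
import array
import struct

raw = struct.pack('5i', 10, 20, 30, 40, 50)   # toy stand-in for srcList.bin
graphEdge = array.array('i')
graphEdge.frombytes(raw)                      # no per-value int objects

print(list(graphEdge))                        # [10, 20, 30, 40, 50]
```

Since array.array is an ordinary Python sequence, torch.Tensor should accept it as input, though I have not benchmarked that conversion here.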
Answer 2 (score: 1)
To reduce the size of your program you can try a smaller integer type, e.g. int16 or int8, instead of a 32-bit integer, if the data fits within those datatypes' ranges.
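As an illustration with NumPy, which the question already imports (the same idea applies to torch dtypes such as torch.int16 and torch.int8):

```python
# Sketch: a narrower dtype halves (int16) or quarters (int8) per-element
# memory, provided every value fits in the smaller type's range.
import numpy as np

values = np.arange(1000)               # toy data, all well below 2**15
as_i32 = values.astype(np.int32)
as_i16 = values.astype(np.int16)

print(as_i32.nbytes, as_i16.nbytes)    # 4000 2000
```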
Answer 3 (score: -1)
I don't know whether it can help much in this case, but in other cases it helps to use Python's garbage collector to free memory held by variables that won't be used later.

You can try invoking it before you measure the memory, to check whether it makes any difference:
import gc
...
gc.collect()
graphEdge = torch.Tensor(graphEdge).to(torch.int).to('cuda:0')
boundList = torch.Tensor(boundList).to(torch.int).to('cuda:0')
memory_size = graphEdge.element_size() * graphEdge.numel()
print(f"Tensor memory: {memory_size/(1024*1024)} MB")
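A small runnable sketch of the pattern, with a toy list and a plain sum() standing in for graphEdge and the tensor conversion: the point is to drop the reference to the big list as soon as it has been converted, then collect before measuring.

```python
# Sketch: del removes the only reference to the big list, making it
# unreachable; gc.collect() then reclaims what it can before measurement.
import gc

nums = list(range(1_000_000))   # stand-in for graphEdge before conversion
total = sum(nums)               # stand-in for the torch.Tensor(...) step
del nums                        # the list is now unreachable
unreachable = gc.collect()      # returns the number of objects collected

print(total)
```

Note that gc.collect() mainly helps with reference cycles; plain unreferenced objects are freed by reference counting as soon as del runs, though the allocator may not return all pages to the OS immediately.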