How do I determine how much memory a MemoryStream object is using in VB.Net?
Question
I'm handling a long-running process that does a lot of copying to a memory stream so it can create an archive, in excess of 2 GB in size, and upload it to Azure. The process is throwing an OutOfMemoryException at some point, and I'm trying to figure out what the upper limit is so I can prevent that: stop the process, upload a partial archive, and pick up where it left off.
But I'm not sure how to determine how much memory the stream is actually using. Right now, I'm using MemoryStream.Length, but I don't think that's the right approach.
So, how can I determine how much memory a MemoryStream is using in VB.Net?
In compliance with the Minimal, Reproducible Example requirements:
Imports System.IO

Module Program
    Sub Main(args As String())
        Dim stream = New MemoryStream
        stream.WriteByte(10)
        'What would I need to put here to determine the size in memory of stream?
    End Sub
End Module
Answer 1
Score: 1
This is a known issue with the garbage collector for .NET Framework (I've heard .NET Core may have addressed this, but I haven't done a deep dive myself to confirm it).
What happens is you have an item with an internal (growing) buffer like a string, StringBuilder, List, MemoryStream, etc. As you write to the object, the buffer fills and at some point needs to be replaced. So behind the scenes a new buffer is allocated, and data is copied from the old to the new. At this point the old buffer becomes eligible for garbage collection.
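To see that growth for yourself (and to get closer to the number the question is after), you can watch MemoryStream.Capacity alongside MemoryStream.Length. Length is how many bytes have been written; Capacity is the size of the buffer currently allocated behind the stream. A minimal sketch, where CountBufferGrowths is just a helper name made up for this example:

```vb
Imports System
Imports System.IO

Module CapacityDemo
    ' Hypothetical helper: writes byteCount bytes one at a time and counts
    ' how often the stream swaps its internal buffer for a larger one.
    Function CountBufferGrowths(byteCount As Integer) As Integer
        Dim growths As Integer = 0
        Using stream As New MemoryStream()
            Dim lastCapacity As Integer = stream.Capacity
            For i As Integer = 1 To byteCount
                stream.WriteByte(10)
                If stream.Capacity <> lastCapacity Then
                    ' Capacity changed, so a new buffer was allocated and the data copied over.
                    Console.WriteLine($"Length={stream.Length}, Capacity={stream.Capacity}")
                    lastCapacity = stream.Capacity
                    growths += 1
                End If
            Next
        End Using
        Return growths
    End Function

    Sub Main()
        Console.WriteLine($"Buffer replaced {CountBufferGrowths(100000)} times")
    End Sub
End Module
```

Note that Capacity only reports the current buffer. Old buffers that have not been collected yet are not included, so the real footprint during a growth step can be noticeably higher.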
Already we see one large inefficiency: copying from the old buffer to the new can be a significant cost as the object grows. Therefore, if you have an idea of your item's final size up front, you probably want to use a mechanism that will let you set that initial size. With a List<T>, for example, this means using the constructor overload that takes a capacity... if you know it. This means the object never (or rarely) has to copy the data between buffers.
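MemoryStream has the same kind of constructor overload. A rough sketch, assuming you can estimate the final size; the 64 MB figure and the CreatePresizedStream name are placeholders for this example only:

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module PresizeDemo
    ' Hypothetical helper: allocates the whole buffer once, up front.
    Function CreatePresizedStream(expectedBytes As Integer) As MemoryStream
        Return New MemoryStream(expectedBytes)
    End Function

    Sub Main()
        ' Assumed estimate of the final size; substitute your own figure.
        Const expectedBytes As Integer = 64 * 1024 * 1024

        Using stream As MemoryStream = CreatePresizedStream(expectedBytes)
            stream.WriteByte(10)
            ' Capacity stays put until more than expectedBytes have been written.
            Console.WriteLine($"Length={stream.Length}, Capacity={stream.Capacity}")
        End Using

        ' The List(Of T) equivalent mentioned above.
        Dim items As New List(Of Integer)(1000)
        Console.WriteLine($"List capacity={items.Capacity}")
    End Sub
End Module
```

Keep in mind that the capacity parameter is an Integer, so a single MemoryStream cannot be presized (or grown) past roughly 2 GB no matter how much memory the machine has.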
But we're not done yet. When the copy operation is finished, the garbage collector WILL appropriately reclaim the memory. However, there's more involved than just memory. There's also the virtual address space for your application's process. The virtual address space formerly occupied by that memory is NOT immediately released. It can be reclaimed through a process called "compaction", but there are situations where this just... doesn't happen. Most notably, memory on the Large Object Heap (LOH) is rarely-to-never compacted. Once your item eclipses a magic 85,000 byte threshold for the LOH, you're starting to build up holes in the virtual address space. Fill these buffers enough, and the address space no longer has a contiguous free block large enough for the next buffer, resulting in (drumroll please)... an OutOfMemoryException. This is the exception thrown, even though there may be plenty of memory available.
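If you want to poke at this, here is a small sketch. It relies on the fact that the runtime reports LOH allocations as generation 2 right away, so the check is only meaningful for freshly allocated objects. The IsFreshlyOnLoh name is invented for this example, and the GCSettings compaction request needs .NET Framework 4.5.1 or later:

```vb
Imports System
Imports System.Runtime

Module LohDemo
    ' Hypothetical helper: only valid for objects that were just allocated,
    ' because ordinary objects can also be promoted to generation 2 over time.
    Function IsFreshlyOnLoh(obj As Object) As Boolean
        Return GC.GetGeneration(obj) = 2
    End Function

    Sub Main()
        Dim small(999) As Byte      ' 1,000 elements, well under the threshold
        Dim large(99999) As Byte    ' 100,000 elements, over the 85,000 byte threshold

        Console.WriteLine($"small on LOH: {IsFreshlyOnLoh(small)}")
        Console.WriteLine($"large on LOH: {IsFreshlyOnLoh(large)}")

        ' A 32-bit process has far less address space to fragment.
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}")

        ' Ask for a one-off LOH compaction during the next blocking full collection.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce
        GC.Collect()
    End Sub
End Module
```

Forcing compaction like this is expensive and treats the symptom rather than the cause, so I'd use it for diagnosis more than as a fix.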
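Since most of the items that create this scenario use a doubling algorithm for the underlying buffer, it's possible to generate these exceptions without using all that much memory: each new buffer is larger than all of the previously freed buffers combined, so it can never fit into the holes they left behind. The exception can therefore show up at a fraction of the virtual address space that is actually available, though that's worst-case and commonly you'll get somewhat higher first.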
To avoid this, be careful in how you use items with buffers that grow dynamically: set the full capacity up front when you can, or stream to a more-robust backing store (like a file) when you can't.
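For the file option, something along these lines should work as a starting point. This is only a sketch: the chunk size is arbitrary, WriteChunksToTempFile is a name made up for the example, the dummy chunks stand in for your archive data, and the upload step is left out because it depends on which Azure client you use:

```vb
Imports System
Imports System.IO

Module FileBackedDemo
    Const ChunkSize As Integer = 81920

    ' Hypothetical helper: writes chunkCount dummy chunks to a temp file
    ' and returns the number of bytes now held on disk.
    Function WriteChunksToTempFile(chunkCount As Integer) As Long
        Dim tempPath As String = Path.GetTempFileName()

        ' DeleteOnClose removes the temp file when the stream is disposed.
        Using archive As New FileStream(tempPath, FileMode.Create, FileAccess.ReadWrite,
                                        FileShare.None, ChunkSize, FileOptions.DeleteOnClose)
            Dim chunk(ChunkSize - 1) As Byte
            For i As Integer = 1 To chunkCount
                archive.Write(chunk, 0, chunk.Length)
            Next

            ' Rewind so an uploader (not shown) could read from the start.
            archive.Position = 0
            Return archive.Length
        End Using
    End Function

    Sub Main()
        Console.WriteLine($"Bytes on disk: {WriteChunksToTempFile(100)}")
    End Sub
End Module
```

Only one chunk-sized array lives in memory at a time here, so the process footprint stays flat regardless of how large the archive gets.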