How to free the memory space in CUDA texture object returned from a function wrapper?
Question
Suppose I have a helper function that simplifies creating CUDA objects from another structured array (specifically, an `mxArray` from Matlab).
It would be a function like this:

    cudaTextureObject_t tex_output = mxArrayToTexture(mxArray *inputMxArray);

This function checks the type and size of `inputMxArray` and hides the details of creating the `cudaArray_t`, texture resource, description, etc. It works great when I have a dozen texture objects to create with different dimensions, sizes and types. However, I am not sure how to clean up and free the memory afterwards.
Typically, I use `cudaDestroyTextureObject(tex_output)` to destroy the texture object. But how do I free the memory allocated to the `cudaArray_t` within the texture object?
I checked the available GPU memory before and after each run of the code, and there is definitely ~300 MB of memory not being released. I suspect the issue is that `cudaFreeArray()` is never called on the `cudaArray_t` hidden within the helper function. Eventually the device will run out of memory if I don't fix this problem.
Any suggestions for improving this approach, so that a helper function creates the CUDA texture objects and the memory is freed correctly?
Answer 1
Score: 1
You can get back the underlying `cudaArray_t` using `cudaGetTextureObjectResourceDesc()`.
For example:

    void destroyAndFreeTexture(cudaTextureObject_t tex) {
        // Query the resource descriptor first: it cannot be retrieved
        // once the texture object has been destroyed.
        cudaResourceDesc resDesc;
        ERR_CHECK(cudaGetTextureObjectResourceDesc(&resDesc, tex));
        ERR_CHECK(cudaDestroyTextureObject(tex));
        if (resDesc.resType == cudaResourceTypeArray) {
            // Free the cudaArray_t that backed the texture.
            cudaArray_t mem = resDesc.res.array.array;
            ERR_CHECK(cudaFreeArray(mem));
        }
        else {
            // ...
        }
    }

where `ERR_CHECK` should be replaced by whatever error-checking scheme you use for the CUDA runtime API.
Depending on the context, you might want to report an error or throw an exception in the `else` branch, or handle the other resource types as well (`cudaResourceTypeMipmappedArray`, `cudaResourceTypeLinear` and `cudaResourceTypePitch2D`).
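If the other resource types do need handling, the `else` branch could look roughly like the sketch below. The function names `freeBackingStorage` and `destroyAndFreeTextureAnyType` are made up for illustration, and error codes are returned rather than passed through `ERR_CHECK`. It also assumes that the helper which created the texture allocated the linear or pitched memory itself (with `cudaMalloc` or `cudaMallocPitch`); if that memory belongs to someone else, it must not be freed here.

```cpp
#include <cuda_runtime.h>

// Hypothetical helper: frees the storage a resource descriptor refers to.
// Assumption: the code that built the texture also owns this storage.
cudaError_t freeBackingStorage(const cudaResourceDesc& resDesc) {
    switch (resDesc.resType) {
    case cudaResourceTypeArray:
        return cudaFreeArray(resDesc.res.array.array);
    case cudaResourceTypeMipmappedArray:
        return cudaFreeMipmappedArray(resDesc.res.mipmap.mipmap);
    case cudaResourceTypeLinear:
        // Assumed to come from cudaMalloc.
        return cudaFree(resDesc.res.linear.devPtr);
    case cudaResourceTypePitch2D:
        // Assumed to come from cudaMallocPitch.
        return cudaFree(resDesc.res.pitch2D.devPtr);
    default:
        return cudaErrorInvalidValue;
    }
}

// Hypothetical variant of destroyAndFreeTexture covering all resource types.
cudaError_t destroyAndFreeTextureAnyType(cudaTextureObject_t tex) {
    cudaResourceDesc resDesc{};
    // The descriptor has to be fetched before the texture is destroyed.
    cudaError_t err = cudaGetTextureObjectResourceDesc(&resDesc, tex);
    if (err != cudaSuccess) return err;
    err = cudaDestroyTextureObject(tex);
    if (err != cudaSuccess) return err;
    return freeBackingStorage(resDesc);
}
```

Returning `cudaError_t` leaves it to the caller to decide whether a failed cleanup should abort, throw, or just be logged.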
I personally would also use this as the destructor of some C++ RAII class, as Abator Abetor proposed in the comments.
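A minimal sketch of what such an RAII class might look like is given below. The class name `TextureHandle` and its interface are hypothetical, it only handles textures backed by a `cudaArray_t`, and it ignores errors during cleanup because it runs from a destructor; adapt all of that to your own conventions.

```cpp
#include <cuda_runtime.h>
#include <utility>

// Hypothetical move-only owner of a texture object and its cudaArray_t.
class TextureHandle {
public:
    TextureHandle() = default;
    explicit TextureHandle(cudaTextureObject_t tex) : tex_(tex), owns_(true) {}
    ~TextureHandle() { reset(); }

    // Copying would lead to a double free, so only moves are allowed.
    TextureHandle(const TextureHandle&) = delete;
    TextureHandle& operator=(const TextureHandle&) = delete;

    TextureHandle(TextureHandle&& other) noexcept
        : tex_(other.tex_), owns_(std::exchange(other.owns_, false)) {}

    TextureHandle& operator=(TextureHandle&& other) noexcept {
        if (this != &other) {
            reset();
            tex_ = other.tex_;
            owns_ = std::exchange(other.owns_, false);
        }
        return *this;
    }

    bool owns() const { return owns_; }
    cudaTextureObject_t get() const { return tex_; }

    // Destroys the texture and frees its backing array, if any is owned.
    void reset() noexcept {
        if (!owns_) return;
        cudaResourceDesc resDesc{};
        // Fetch the descriptor before the texture object goes away.
        const bool haveDesc =
            cudaGetTextureObjectResourceDesc(&resDesc, tex_) == cudaSuccess;
        cudaDestroyTextureObject(tex_);  // errors ignored in this sketch
        if (haveDesc && resDesc.resType == cudaResourceTypeArray) {
            cudaFreeArray(resDesc.res.array.array);
        }
        owns_ = false;
    }

private:
    cudaTextureObject_t tex_ = 0;
    bool owns_ = false;
};
```

With the helper from the question, usage would be along the lines of `TextureHandle tex(mxArrayToTexture(inputMxArray));`, passing `tex.get()` to kernels; the array is then released when `tex` goes out of scope.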