CUDA memory error while running Haystack PromptNode with GPU

Question

I am getting a CUDA out-of-memory error while running this code:

prompt_node = PromptNode(model_name_or_path='google/flan-t5-xl',
                         default_prompt_template=lfqa_prompt,
                         use_gpu=True,
                         max_length=300)

I tried to resolve the CUDA issue without success. I am using the GPU with the retriever and it works fine; the error only occurs when I use the prompt node with the GPU. Any suggestions on how to fix it?

The error is:

> torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate
> 16.00 MiB (GPU 0; 14.85 GiB total capacity; 4.02 GiB already allocated; 17.44 MiB free; 4.02 GiB reserved in total by PyTorch) If
> reserved memory is >> allocated memory try setting max_split_size_mb
> to avoid fragmentation. See documentation for Memory Management and
> PYTORCH_CUDA_ALLOC_CONF.

Answer 1 (score: 2)

For the model you are using, 'google/flan-t5-xl', there are smaller alternatives such as 'google/flan-t5-small' or 'google/flan-t5-base'. They require much less memory, and that would be my suggestion here.
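
For example, a minimal sketch of the question's setup with the smaller checkpoint swapped in (this assumes Haystack 1.x, where PromptNode is importable from haystack.nodes, and reuses the lfqa_prompt template defined in the question):

from haystack.nodes import PromptNode

# Same configuration as in the question, but with the much smaller
# 'google/flan-t5-base' checkpoint, which needs a fraction of the GPU memory.
prompt_node = PromptNode(model_name_or_path='google/flan-t5-base',
                         default_prompt_template=lfqa_prompt,  # template from the question
                         use_gpu=True,
                         max_length=300)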

Quantization would be a different approach. Haystack doesn't support quantization out of the box yet, but I believe it wouldn't be too difficult to add, so maybe you can make a feature request through a GitHub issue?
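
Outside of Haystack, loading the same model with 8-bit quantization via transformers would look roughly like this; a sketch that assumes the bitsandbytes and accelerate packages are installed and a CUDA GPU is available:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('google/flan-t5-xl')

# load_in_8bit quantizes the weights to int8 through bitsandbytes,
# cutting the memory footprint to roughly a quarter of fp32.
model = AutoModelForSeq2SeqLM.from_pretrained('google/flan-t5-xl',
                                              device_map='auto',
                                              load_in_8bit=True)

inputs = tokenizer('Answer the question: ...', return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_length=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))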

In the particular error message you posted, it seems that not all of the GPU memory is being used: for some reason the process appears to be limited to about 4 GiB of the 14.85 GiB total. It could well be that this is not related to the model but to a bug in torch or in the execution environment. Have you tried running it in a fresh environment? You might want to check whether your problem is similar to one of the following torch issues:
https://github.com/pytorch/pytorch/issues/40002 or https://github.com/pytorch/pytorch/issues/67680
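
As a quick check, you can also follow the hint from the error message: set PYTORCH_CUDA_ALLOC_CONF before torch initializes CUDA, and ask the driver how much memory is actually free (a sketch; 128 MiB is an arbitrary starting value for max_split_size_mb):

import os

# Must be set before torch initializes CUDA.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'

import torch

# mem_get_info() returns (free, total) in bytes as reported by the driver.
# If "free" is far below the card's capacity before the model is even loaded,
# another process or a stale allocation is holding GPU memory.
free, total = torch.cuda.mem_get_info()
print(f'free: {free / 1024**3:.2f} GiB / total: {total / 1024**3:.2f} GiB')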
