CUDA memory error while running Haystack PromptNode with GPU

Question

I am getting a CUDA out-of-memory error while running this code:

prompt_node = PromptNode(model_name_or_path='google/flan-t5-xl',
                         default_prompt_template=lfqa_prompt,
                         use_gpu=True,
                         max_length=300)

I tried to resolve the CUDA issue without success. I am using the GPU with the retriever and it works fine; the error only occurs when I use the prompt node with the GPU. Any suggestions on how to fix it?

The error is:

> torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate
> 16.00 MiB (GPU 0; 14.85 GiB total capacity; 4.02 GiB already allocated; 17.44 MiB free; 4.02 GiB reserved in total by PyTorch) If
> reserved memory is >> allocated memory try setting max_split_size_mb
> to avoid fragmentation. See documentation for Memory Management and
> PYTORCH_CUDA_ALLOC_CONF.

Answer 1 (score: 2)

For the model you are using, 'google/flan-t5-xl', there are smaller alternatives such as 'google/flan-t5-small' or 'google/flan-t5-base'. They require much less memory, and that would be my suggestion here.
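
For example, a minimal sketch of the question's setup with the smaller checkpoint swapped in (this assumes Haystack 1.x, where PromptNode is importable from haystack.nodes, and reuses the lfqa_prompt template defined in the question):

from haystack.nodes import PromptNode

# Same configuration as in the question, but with the much smaller
# 'google/flan-t5-base' checkpoint, which needs a fraction of the GPU memory.
prompt_node = PromptNode(model_name_or_path='google/flan-t5-base',
                         default_prompt_template=lfqa_prompt,  # template from the question
                         use_gpu=True,
                         max_length=300)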

Quantization would be a different approach. Haystack doesn't support quantization out of the box yet, but I believe it wouldn't be too difficult to add, so maybe you can make a feature request through a GitHub issue?
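
Outside of Haystack, loading the same model with 8-bit quantization via transformers would look roughly like this; a sketch that assumes the bitsandbytes and accelerate packages are installed and a CUDA GPU is available:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained('google/flan-t5-xl')

# load_in_8bit quantizes the weights to int8 through bitsandbytes,
# cutting the memory footprint to roughly a quarter of fp32.
model = AutoModelForSeq2SeqLM.from_pretrained('google/flan-t5-xl',
                                              device_map='auto',
                                              load_in_8bit=True)

inputs = tokenizer('Answer the question: ...', return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_length=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))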

In the particular error message you posted, it seems that not all of the GPU memory is being used: for some reason the process appears to be limited to about 4 GiB of the 14.85 GiB total. It could well be that this is not related to the model but to a bug in torch or in the execution environment. Have you tried running it in a fresh environment? You might want to check whether your problem is similar to one of the following torch issues:
https://github.com/pytorch/pytorch/issues/40002 or https://github.com/pytorch/pytorch/issues/67680
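
As a quick check, you can also follow the hint from the error message: set PYTORCH_CUDA_ALLOC_CONF before torch initializes CUDA, and ask the driver how much memory is actually free (a sketch; 128 MiB is an arbitrary starting value for max_split_size_mb):

import os

# Must be set before torch initializes CUDA.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:128'

import torch

# mem_get_info() returns (free, total) in bytes as reported by the driver.
# If "free" is far below the card's capacity before the model is even loaded,
# another process or a stale allocation is holding GPU memory.
free, total = torch.cuda.mem_get_info()
print(f'free: {free / 1024**3:.2f} GiB / total: {total / 1024**3:.2f} GiB')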
