Getting ModuleNotFoundError: No module named 'torch.distributed._shard'
Question
I'm running some Python code that uses the PyTorch Lightning framework. I get the following error:
> File "/Home/LightningVersion.py", line 45, in __init__
super().__init__()
File "/Home/.local/lib/python3.9/site-packages/pytorch_lightning/core/module.py", line 128, in __init__
self._register_sharded_tensor_state_dict_hooks_if_available()
File "/Home/.local/lib/python3.9/site-packages/pytorch_lightning/core/module.py", line 1570, in _register_sharded_tensor_state_dict_hooks_if_available
from torch.distributed._shard.sharded_tensor import pre_load_state_dict_hook, state_dict_hook
ModuleNotFoundError: No module named 'torch.distributed._shard'
I am using CUDA 11.4 and Python 3.9.10.
Does anyone know how to fix this?
I cannot find anything online that helps, despite searching.
Answer 1
Score: 3
This module is responsible for sharding tensors across multiple GPUs, and it is available in PyTorch versions 1.8 and higher. Upgrade your PyTorch version to 1.8 or higher by running:
!pip install torch==1.8.0
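Before pinning a version, it may help to confirm what is actually installed in the environment that raises the error. The traceback shows packages loaded from a user-level `site-packages` directory, so the interpreter may be picking up a different `torch` than expected. The following is a minimal diagnostic sketch, not part of PyTorch or Lightning: the helper names `module_available` and `installed_version` are made up for illustration, and it relies only on the standard library.

```python
import importlib.util
from importlib import metadata


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    Note: for a dotted name, find_spec imports the parent packages
    (here `torch` and `torch.distributed`) in order to search them.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # A parent package (for example `torch` itself) is missing.
        return False


def installed_version(dist: str):
    """Return the installed version string of a distribution, or None."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names below are the usual PyPI names (an assumption).
    print("torch version:", installed_version("torch"))
    print("pytorch-lightning version:", installed_version("pytorch-lightning"))
    print("torch.distributed._shard available:",
          module_available("torch.distributed._shard"))
```

If the last line prints `False`, the installed `torch` does not ship that module, and the `torch` and `pytorch-lightning` versions need to be brought in line with each other.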