Getting ModuleNotFoundError: No module named 'torch.distributed._shard'

Question

I'm running some Python code that uses the PyTorch Lightning framework. I get the error:
> File "/Home/LightningVersion.py", line 45, in __init__
>     super().__init__()
> File "/Home/.local/lib/python3.9/site-packages/pytorch_lightning/core/module.py", line 128, in __init__
>     self._register_sharded_tensor_state_dict_hooks_if_available()
> File "/Home/.local/lib/python3.9/site-packages/pytorch_lightning/core/module.py", line 1570, in _register_sharded_tensor_state_dict_hooks_if_available
>     from torch.distributed._shard.sharded_tensor import pre_load_state_dict_hook, state_dict_hook
> ModuleNotFoundError: No module named 'torch.distributed._shard'

I am using CUDA 11.4 and Python 3.9.10.

Does anyone know how to fix this?

I cannot find anything online that helps, despite searching.

Answer 1

Score: 3

This module is responsible for sharding tensors across multiple GPUs, and it is available in PyTorch versions 1.8 and higher. Upgrade your PyTorch version to 1.8 or higher by running the following (the leading ! is notebook syntax; omit it in a regular shell):

!pip install torch==1.8.0
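
Before and after changing versions, it may help to check directly whether the module from the traceback can be found, rather than relying on a version number alone. This thread does not confirm that 1.8 is sufficient. The sketch below is only an illustration: the helper names module_available, installed_version, and report are made up for this example, and the distribution names "torch" and "pytorch-lightning" are assumed to match how the packages were installed.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def module_available(name: str) -> bool:
    """Return True if the dotted module `name` can be located.

    find_spec imports parent packages first, so a missing parent
    (for example, torch itself) surfaces as an ImportError here.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # A parent package is missing or failed to import.
        return False


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


def report() -> None:
    """Print the installed versions and whether the failing import resolves."""
    # Distribution names are assumptions; adjust if yours differ.
    print("torch version:", installed_version("torch"))
    print("pytorch-lightning version:", installed_version("pytorch-lightning"))
    print(
        "torch.distributed._shard importable:",
        module_available("torch.distributed._shard"),
    )


if __name__ == "__main__":
    report()
```

If the last line still prints False after the upgrade, the installed PyTorch build does not ship that module, and a different PyTorch or Lightning version is needed.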

huangapple
  • Posted on April 19, 2023, 17:48:08
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76053076.html