Why does torch.cuda.is_available() return false even though I have it installed?

Question

I have an NVIDIA GeForce GTX 1650 Ti graphics card. I have already checked that it is compatible with CUDA 12.1.

I ran the nvidia-smi command. Here are the results:

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.14                 Driver Version: 531.14       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                      TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce GTX 1650 Ti    WDDM | 00000000:01:00.0 Off |                  N/A |
| N/A   37C    P8                4W /  N/A|      0MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

I just installed PyTorch using:

conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch-nightly -c nvidia

But for some reason, torch.cuda.is_available() returns False.

Here are some more details:

  • Results of nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:36:15_Pacific_Daylight_Time_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
  • When I run python -m torch.utils.collect_env, I get:
PyTorch version: 2.1.0.dev20230608
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

[...]

Is CUDA available: False
CUDA runtime version: 12.1.105
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce GTX 1650 Ti
Nvidia driver version: 531.14

[...]

[conda] pytorch                   2.1.0.dev20230608    py3.10_cpu_0    pytorch-nightly
[conda] pytorch-cuda              12.1                 hde6ce7c_5    pytorch-nightly
[conda] pytorch-mutex             1.0                         cpu    pytorch-nightly
[conda] torchaudio                2.1.0.dev20230608       py310_cpu    pytorch-nightly
[conda] torchvision               0.16.0.dev20230608       py310_cpu    pytorch-nightly

Answer 1

Score: 1

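Sometimes, when working on projects with many dependencies, one package can replace the PyTorch installation with a CPU-only build, which breaks GPU support in your environment. The collect_env output in the question points that way: the pytorch build string is py3.10_cpu_0 and pytorch-mutex is cpu.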
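
As a rough illustration of how to spot a CPU-only build from a package listing, here is a minimal sketch. The helper name find_cpu_builds is hypothetical, and treating a "cpu" token in the build string as the marker is an assumption based on the listing in the question, not an official conda rule.

```python
def find_cpu_builds(conda_lines):
    """Return names of packages whose conda build string looks CPU-only."""
    flagged = []
    for line in conda_lines:
        parts = line.replace("[conda]", "").split()
        # Expected columns: name, version, build string, channel.
        if len(parts) < 3:
            continue
        name, build = parts[0], parts[2]
        # Heuristic (an assumption): a "cpu" token in the build string.
        if "cpu" in build.lower().split("_"):
            flagged.append(name)
    return flagged


# Lines copied from the collect_env output in the question.
listing = [
    "[conda] pytorch                   2.1.0.dev20230608    py3.10_cpu_0    pytorch-nightly",
    "[conda] pytorch-cuda              12.1                 hde6ce7c_5    pytorch-nightly",
    "[conda] pytorch-mutex             1.0                         cpu    pytorch-nightly",
]
print(find_cpu_builds(listing))  # ['pytorch', 'pytorch-mutex']
```

If pytorch itself shows up in the result, the environment most likely holds a CPU-only build regardless of which pytorch-cuda version sits next to it.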

My recommendation is to isolate the problem by creating a brand-new environment, to see whether the PyTorch installation is at fault or whether it is something deeper.

In a new directory, from either PowerShell or cmd, run:

python -m venv venv
.\venv\Scripts\activate  # Windows
# source ./venv/bin/activate  # Unix

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117

python -c "import torch; print(torch.cuda.is_available())"
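
The one-liner above only prints True or False. A slightly longer check, sketched below, can also tell a CPU-only build apart from a CUDA build that cannot see the GPU. It relies on torch.version.cuda, which is None when PyTorch was built without CUDA; the helper name diagnose and its messages are illustrative assumptions.

```python
def diagnose(cuda_build, cuda_available):
    """Turn the two facts PyTorch reports into a short diagnosis."""
    if cuda_build is None:
        # torch.version.cuda is None for builds compiled without CUDA.
        return "CPU-only build: reinstall a CUDA-enabled PyTorch"
    if not cuda_available:
        return f"built for CUDA {cuda_build}, but no usable GPU/driver was found"
    return f"CUDA {cuda_build} build with a working GPU"


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this interpreter")
    else:
        print("torch", torch.__version__)
        print(diagnose(torch.version.cuda, torch.cuda.is_available()))
```

The first branch would match the situation in the question, where collect_env reports "CUDA used to build PyTorch: Could not collect".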

You can double-check that you are using the Python from your venv by running:

where.exe python  # Windows (cmd or PowerShell)
# which python  # Unix
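
The same check can be done from inside Python, which avoids the shell differences. This is a small sketch that applies to environments created with python -m venv (a conda environment would not be detected this way); the function name in_virtualenv is just an illustrative choice.

```python
import sys


def in_virtualenv():
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
```

If the second line prints False after activation, the shell is still picking up a different Python than the one in the venv.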

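If the output is still False in a clean environment, the problem most likely lies outside the PyTorch installation.

Next, make sure your drivers are up to date: https://www.nvidia.com/download/index.aspx

If you also use a local CUDA toolkit, make sure it is at least CUDA 11: https://developer.nvidia.com/cuda-downloads (the prebuilt PyTorch binaries bundle their own CUDA runtime, so what matters most for torch.cuda.is_available() is the driver).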

It might be worth rolling back your CUDA installation from 12 to 11 to see if that helps.
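
To compare the driver against the CUDA version a wheel targets, the nvidia-smi header can be parsed. The "CUDA Version" field there is the highest CUDA version the driver supports, not the installed toolkit. The sketch below uses hypothetical helper names and a deliberately simple version comparison, so treat it as a rough check only.

```python
import re


def parse_smi_header(text):
    """Extract (driver version, max supported CUDA version) from nvidia-smi text."""
    driver = re.search(r"Driver Version:\s*([\d.]+)", text)
    cuda = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return (driver.group(1) if driver else None,
            cuda.group(1) if cuda else None)


def driver_supports(max_cuda, wheel_cuda):
    """True if the wheel's CUDA version is not newer than the driver's maximum."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return as_tuple(wheel_cuda) <= as_tuple(max_cuda)


# Header line copied from the nvidia-smi output in the question.
header = "| NVIDIA-SMI 531.14                 Driver Version: 531.14       CUDA Version: 12.1     |"
driver, max_cuda = parse_smi_header(header)
print(driver, max_cuda)                   # 531.14 12.1
print(driver_supports(max_cuda, "11.7"))  # True
```

By this check, the driver in the question is new enough for both the cu117 wheels and a CUDA 12.1 build, which again points at the installed PyTorch build rather than the driver.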

Finally, you can take solace in the fact that you are not the only one who has faced this problem recently [1]. It may be the fault of recent NVIDIA software.

1: https://forums.developer.nvidia.com/t/cuda-enabled-geforce-1650/81010/21

huangapple
  • Posted on June 9, 2023, 01:01:41
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76434168.html