2023-05-06 22:13:18

Having problem with TensorFlow not recognizing my GPU (NVIDIA 4090 RTX)

Question

For some reason my installation of TensorFlow on an Ubuntu Focal is not recognizing my GPU.

testTensorFlowTTS.py

2023-05-05 20:10:21.682174: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-05-05 20:10:21.704546: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
2023-05-05 20:10:21.704838: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-05-05 20:10:22.134455: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
<CTRL-D>
terminate called after throwing an instance of 'std::runtime_error'
  what():  random_device could not be read
Aborted (core dumped)

The script is a simple test script:

#!/usr/bin/python3
import tensorflow as tf
import yaml
import numpy as np
import IPython.display as ipd
from transformers import pipeline
from tensorflow_tts.inference import TFAutoModel
from tensorflow_tts.inference import AutoConfig
from tensorflow_tts.inference import AutoProcessor

Output of nvidia-smi:

Sat May  6 07:05:33 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090         Off| 00000000:41:00.0  On |                  Off |
|  0%   36C    P8               43W / 450W|    301MiB / 24564MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1514      G   /usr/lib/xorg/Xorg                           59MiB |
|    0   N/A  N/A      2384      G   /usr/lib/xorg/Xorg                           91MiB |
|    0   N/A  N/A      2547      G   ...39943991,1614355343741730628,131072      131MiB |
+---------------------------------------------------------------------------------------+

I ran the script above and expected no errors.
I tried updating several things, including:

python3 -m pip install nvidia-tensorrt
apt-get install nvidia-cuda-toolkit libnvvm

Answer 1

Score: 1

Please follow the official TensorFlow installation guide step by step to install TensorFlow on your system.
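Once the installation is done, a quick check like the one below may help confirm whether TensorFlow can actually see the GPU. This is only a minimal sketch: the helper name `gpu_report` and the dictionary keys are made up for illustration, while `tf.test.is_built_with_cuda` and `tf.config.list_physical_devices` are standard TensorFlow 2.x calls.

```python
"""Report whether the installed TensorFlow build can see a GPU."""


def gpu_report():
    """Return a dict describing the TensorFlow install and the GPUs it sees.

    `gpu_report` is a hypothetical helper, not part of TensorFlow.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this interpreter at all.
        return {"installed": False, "version": None,
                "built_with_cuda": None, "gpus": []}

    return {
        "installed": True,
        "version": tf.__version__,
        # False means a CPU-only build, which cannot use the GPU.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list usually means the CUDA libraries could not be loaded
        # or no supported GPU is present.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

If `gpus` comes back empty while `nvidia-smi` still lists the card, the NVIDIA driver is probably fine and the problem is more likely in the CUDA/cuDNN libraries that TensorFlow tries to load.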

Please make sure every system requirement listed in the guide is met, and set the paths for all the required software.
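To see whether the paths are set so that the CUDA libraries can be found, something like the following sketch could be used. It relies only on the Python standard library. The list of library names is an assumption for illustration (the exact set and versions depend on the TensorFlow release), and `find_cuda_libs` is a hypothetical helper name.

```python
import ctypes.util
import os

# Short names of libraries a GPU build of TensorFlow typically loads.
# Illustrative only: the exact set and versions depend on the release.
CUDA_LIBS = ("cudart", "cublas", "cudnn")


def find_cuda_libs(names=CUDA_LIBS):
    """Map each short library name to what the system linker resolves, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    print("LD_LIBRARY_PATH =", os.environ.get("LD_LIBRARY_PATH", "(unset)"))
    for name, resolved in find_cuda_libs().items():
        print(f"lib{name}: {resolved or 'NOT FOUND'}")
```

A `NOT FOUND` entry suggests the library is either not installed or not on the linker's search path. Libraries that were installed with pip inside a virtual environment may not show up here, so treat the result as a hint rather than a verdict.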

Let us know if the issue still persists. Thank you.
