Having a problem with TensorFlow not recognizing my GPU (NVIDIA RTX 4090)


Question


For some reason, my installation of TensorFlow on Ubuntu Focal is not recognizing my GPU.

    testTensorFlowTTS.py
    2023-05-05 20:10:21.682174: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-05-05 20:10:21.704546: I tensorflow/tsl/cuda/cudart_stub.cc:28] Could not find cuda drivers on your machine, GPU will not be used.
    2023-05-05 20:10:21.704838: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
    To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
    2023-05-05 20:10:22.134455: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
    <CTRL-D>
    terminate called after throwing an instance of 'std::runtime_error'
    what(): random_device could not be read
    Aborted (core dumped)

The script is a simple test script:

    #!/usr/bin/python3
    import tensorflow as tf
    import yaml
    import numpy as np
    import IPython.display as ipd
    from transformers import pipeline
    from tensorflow_tts.inference import TFAutoModel
    from tensorflow_tts.inference import AutoConfig
    from tensorflow_tts.inference import AutoProcessor

    nvidia-smi
    Sat May 6 07:05:33 2023
    +---------------------------------------------------------------------------------------+
    | NVIDIA-SMI 530.41.03              Driver Version: 530.41.03    CUDA Version: 12.1     |
    |-----------------------------------------+----------------------+----------------------+
    | GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                                         |                      |               MIG M. |
    |=========================================+======================+======================|
    |   0  NVIDIA GeForce RTX 4090         Off| 00000000:41:00.0  On |                  Off |
    |  0%   36C    P8               43W / 450W|    301MiB / 24564MiB |      1%      Default |
    |                                         |                      |                  N/A |
    +-----------------------------------------+----------------------+----------------------+
    +---------------------------------------------------------------------------------------+
    | Processes:                                                                            |
    |  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
    |        ID   ID                                                             Usage      |
    |=======================================================================================|
    |    0   N/A  N/A      1514      G   /usr/lib/xorg/Xorg                           59MiB |
    |    0   N/A  N/A      2384      G   /usr/lib/xorg/Xorg                           91MiB |
    |    0   N/A  N/A      2547      G   ...39943991,1614355343741730628,131072      131MiB |
    +---------------------------------------------------------------------------------------+

I ran the script above and expected no errors.
I tried updating several things, including:

    python3 -m pip install nvidia-tensorrt
    apt-get install nvidia-cuda-toolkit libnvvm

Answer 1

Score: 1


Please go through this official TensorFlow link to install TensorFlow on your system, and follow the step-by-step instructions.
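Once the installation steps are done, a quick check from Python shows whether TensorFlow can actually see the GPU. The sketch below is a minimal example, assuming TensorFlow 2.x; the helper name `gpu_report` is made up for illustration and is not part of TensorFlow. The import is guarded so the script still runs, and reports the problem, on a machine where TensorFlow is missing.

```python
def gpu_report():
    """Return a dict describing whether TensorFlow is installed and sees a GPU."""
    # Hypothetical helper for illustration; not a TensorFlow API.
    try:
        import tensorflow as tf
    except ImportError:
        return {
            "tensorflow_installed": False,
            "version": None,
            "cuda_build": None,
            "gpus": [],
        }

    return {
        "tensorflow_installed": True,
        "version": tf.__version__,
        # True only if this TensorFlow binary was compiled with CUDA support.
        "cuda_build": tf.test.is_built_with_cuda(),
        # An empty list means TensorFlow did not register any GPU device.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

An empty `gpus` list together with `cuda_build: True` would suggest that the TensorFlow binary supports CUDA but could not load the libraries it needs at run time.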

Please make sure all the system requirements are satisfied as specified in the link, and set the paths for all the required software.
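The `cudart_stub.cc` messages in the question's log appear to come from TensorFlow failing to load the CUDA runtime library, which can happen even when `nvidia-smi` shows a working driver. As a rough sketch using only the Python standard library, the following reports which CUDA-related shared libraries the system can locate. The helper name `find_cuda_libraries` and the list of library names are assumptions chosen for illustration, not an official checklist.

```python
import ctypes.util
import os
import shutil

# Assumed names for illustration: CUDA driver, CUDA runtime, cuDNN.
LIBRARIES = ("cuda", "cudart", "cudnn")


def find_cuda_libraries(names=LIBRARIES):
    """Map each library name to what the system library lookup finds, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    print("nvidia-smi on PATH:", shutil.which("nvidia-smi"))
    print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH", "(not set)"))
    for name, found in find_cuda_libraries().items():
        print(f"lib{name}: {found or 'NOT FOUND'}")
```

A `NOT FOUND` next to `libcudart` or `libcudnn` alongside a working `nvidia-smi` would point at a missing or unlisted CUDA toolkit rather than at the driver. Note that `find_library` only consults the system's usual lookup locations, so libraries installed elsewhere (for example inside a virtual environment) may be reported as not found.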

Let us know if the issue still persists. Thank you.

huangapple
  • Posted on May 6, 2023 at 22:13:18
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76189357.html