Check the GPU information
nvidia-smi
Output
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.116.04   Driver Version: 525.116.04   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:06:00.0 Off |                  N/A |
| 30%   49C    P0   121W / 350W |      0MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce ...  Off  | 00000000:07:00.0 Off |                  N/A |
| 30%   48C    P0   121W / 350W |      0MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  NVIDIA GeForce ...  Off  | 00000000:0D:00.0 Off |                  N/A |
| 30%   49C    P0   105W / 350W |      0MiB / 24576MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  NVIDIA GeForce ...  Off  | 00000000:0E:00.0 Off |                  N/A |
| 30%   48C    P0   116W / 350W |      0MiB / 24576MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
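For scripting, the driver and CUDA versions can be pulled out of the banner line instead of being read by eye. The following is a minimal sketch, assuming the banner keeps the "Driver Version: ... CUDA Version: ..." wording shown above; smi_versions is a hypothetical helper name, not part of nvidia-smi.

```shell
# smi_versions (hypothetical helper): read nvidia-smi output on stdin and
# print the driver and CUDA versions found in the banner line.
smi_versions() {
    sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/driver=\1 cuda=\2/p'
}

# Usage on a machine with the driver loaded:
#   nvidia-smi | smi_versions
```

For the banner above this prints driver=525.116.04 cuda=12.0. Note that the CUDA version in the banner is the highest version the driver supports, not necessarily an installed toolkit.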
Add the repository
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo yum install cuda-12-0.x86_64
The install fails with this error
Problem: package nvidia-kmod-common-3:530.30.02-1.el8.noarch from cuda-rhel8-x86_64 requires nvidia-kmod = 3:530.30.02, but none of the providers can be installed
- package kmod-nvidia-530.30.02-4.18.0-425.13.1-3:530.30.02-3.el8_7.x86_64 from cuda-rhel8-x86_64 requires (kernel = 4.18.0-425.13.1.el8_7 if kernel), but none of the providers can be installed
- package kmod-nvidia-530.30.02-4.18.0-425.19.2-3:530.30.02-3.el8_7.x86_64 from cuda-rhel8-x86_64 requires (kernel = 4.18.0-425.13.1.el8_7 if kernel), but none of the providers can be installed
- package kmod-nvidia-530.30.02-4.18.0-477.10.1-3:530.30.02-3.el8_8.x86_64 from cuda-rhel8-x86_64 requires (kernel = 4.18.0-477.10.1.el8_8 if kernel), but none of the providers can be installed
- package kmod-nvidia-530.30.02-4.18.0-477.13.1-3:530.30.02-3.el8_8.x86_64 from cuda-rhel8-x86_64 requires (kernel = 4.18.0-477.13.1.el8_8 if kernel), but none of the providers can be installed
- problem with installed package kernel-4.18.0-448.el8.x86_64
- package nvidia-driver-3:530.30.02-1.el8.x86_64 from cuda-rhel8-x86_64 requires nvidia-kmod-common = 3:530.30.02, but none of the providers can be installed
- package cuda-drivers-530.30.02-1.x86_64 from cuda-rhel8-x86_64 requires nvidia-driver >= 3:530.30.02, but none of the providers can be installed
- package cuda-runtime-12-0-12.0.1-1.x86_64 from cuda-rhel8-x86_64 requires cuda-drivers >= 525.85.12, but none of the providers can be installed
- package cuda-12-0-12.0.1-1.x86_64 from cuda-rhel8-x86_64 requires cuda-runtime-12-0 >= 12.0.1, but none of the providers can be installed
- cannot install the best candidate for the job
- package cuda-drivers-525.105.17-1.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering
- package cuda-drivers-525.85.12-1.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering
- nothing provides dkms needed by kmod-nvidia-latest-dkms-3:530.30.02-1.el8.x86_64 from cuda-rhel8-x86_64
- package kmod-nvidia-open-dkms-3:530.30.02-1.el8.x86_64 from cuda-rhel8-x86_64 is filtered out by modular filtering
- nothing provides dkms needed by kmod-nvidia-open-dkms-3:530.30.02-1.el8.x86_64 from cuda-rhel8-x86_64
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
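The "requires (kernel = ... if kernel)" lines are the most informative part of this error: each one appears to name the kernel release that a prebuilt kmod-nvidia package was built for. A small sketch for pulling those releases out of a saved copy of the error follows; kmod_required_kernels is a hypothetical helper name, and the pattern assumes the message wording shown above.

```shell
# kmod_required_kernels (hypothetical helper): read dnf/yum error text on
# stdin and print each distinct kernel release named in a
# "requires (kernel = X if kernel)" clause.
kmod_required_kernels() {
    sed -n 's/.*requires (kernel = \([^ ]*\) if kernel).*/\1/p' | sort -u
}

# Usage: save the error text to a file (the name here is arbitrary), then
#   kmod_required_kernels < dnf-error.txt
```

The resulting list can then be compared with the kernel that is actually installed.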
Check the kernel version
uname -ra
Linux localhost.localdomain 4.18.0-496.el8.x86_64 #1 SMP Mon Jun 5 15:04:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
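To compare this release against the kernels named in the dnf error, the architecture suffix has to be dropped first, since the error lists releases without ".x86_64". A minimal sketch follows; kernel_in_list is a hypothetical helper, and the x86_64 suffix is assumed from the output above.

```shell
# kernel_in_list (hypothetical helper): succeed if the kernel release in $1
# (as printed by `uname -r`) equals one of the remaining arguments.
kernel_in_list() {
    kil_running=${1%.x86_64}   # drop the architecture suffix, if present
    shift
    for kil_candidate in "$@"; do
        if [ "$kil_running" = "$kil_candidate" ]; then
            return 0
        fi
    done
    return 1
}

# Usage, with the releases taken from the error message:
#   if kernel_in_list "$(uname -r)" 4.18.0-425.13.1.el8_7 \
#          4.18.0-477.10.1.el8_8 4.18.0-477.13.1.el8_8; then
#       echo "a prebuilt kmod targets this kernel"
#   else
#       echo "no prebuilt kmod for this kernel"
#   fi
```

For the 4.18.0-496.el8 kernel above, none of the releases listed in the error match.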
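Problem:
After some investigation, the conflicts turned out to stem from the installed nvidia-kmod-headers package, so remove it: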
sudo yum remove nvidia-kmod-headers
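Before removing anything, it can help to see which NVIDIA-related packages are actually installed, since leftovers from an earlier driver install are easy to miss. The sketch below filters a package list; nvidia_packages is a hypothetical helper, and the name prefixes it matches are an assumption based on the package names in the error above.

```shell
# nvidia_packages (hypothetical helper): read one package name per line on
# stdin and keep only those whose names start with an NVIDIA/CUDA prefix.
nvidia_packages() {
    grep -E '^(nvidia|kmod-nvidia|cuda)' | sort
}

# Usage on an RPM-based system:
#   rpm -qa | nvidia_packages
```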
Reinstall
sudo yum install cuda
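Once the install finishes, a quick sanity check is to see which toolkit release nvcc reports. This is a sketch under two assumptions: that nvcc is on PATH (on many installs it lives under /usr/local/cuda/bin and is not), and that its version line contains "release X.Y,". nvcc_release is a hypothetical helper name.

```shell
# nvcc_release (hypothetical helper): read `nvcc --version` output on stdin
# and print the toolkit release number from the "release X.Y," line.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage:
#   nvcc --version | nvcc_release
```

Because the cuda package name is not pinned to a version, the reported release may be newer than 12.0.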