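Installing tensorflow-gpu on Ubuntu 18.04 (without Docker)
Versions
tensorflow-gpu: 1.14.0, nvidia-driver-418, cuda-10.0
Note: these versions must match each other, otherwise all sorts of problems follow. See the tested version combinations on the TensorFlow website.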
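The version pairing can be written down as a small lookup so that a setup script fails early on a mismatch. This is only a sketch: `required_cuda` is a hypothetical helper name, and the rows are assumed to be copied by hand from the tested build configurations table on tensorflow.org, which remains the authority.

```shell
# Hypothetical helper: print the CUDA release that a given tensorflow-gpu
# release was built against. The rows are an assumption copied by hand from
# the tested build configurations table; re-check them against tensorflow.org.
required_cuda() {
    case "$1" in
        1.13.*|1.14.*) echo "10.0" ;;
        1.12.*)        echo "9.0" ;;
        *)             echo "unknown"; return 1 ;;
    esac
}

required_cuda 1.14.0
```

For the versions used in this guide the call prints 10.0, which is why cuda-10.0 is installed below.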
First, install nvidia-driver-418
Check the current graphics driver
sudo lshw -C display | grep configuration
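If only the driver name is needed, it can be cut out of the configuration line. A minimal sketch, assuming the line has the usual `configuration: driver=... latency=...` shape; `current_driver` is a hypothetical helper name and the sample line is illustrative, not captured from this machine.

```shell
# Print the kernel driver named in an lshw "configuration:" line read from stdin.
current_driver() {
    sed -n 's/.*configuration:.*driver=\([^ ]*\).*/\1/p'
}

# Illustrative sample line; on a real machine pipe in `sudo lshw -C display`.
echo "       configuration: driver=nouveau latency=0" | current_driver
```

A result of nouveau means the open-source driver is still active and the NVIDIA driver has not been installed yet.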
Add the repository that provides nvidia-driver-418 to apt
# Download the CUDA deb file
wget https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu1804/x86_64/cuda-10-0_10.0.130-1_amd64.deb
# Install the downloaded deb package
sudo dpkg -i cuda-10-0_10.0.130-1_amd64.deb
# Fetch NVIDIA's public signing key
sudo apt-key adv --fetch-keys https://developer.download.nvidia.cn/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt update
wget https://developer.download.nvidia.cn/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
sudo apt update
Install the driver
# List the NVIDIA drivers that apt now offers after the previous step, and check that the version we want is among them
ubuntu-drivers devices
# Output:
== /sys/devices/pci0000:00/0000:00:02.0/0000:03:00.0 ==
modalias : pci:v000010DEd00001B84sv00007377sd00000000bc03sc00i00
vendor : NVIDIA Corporation
model : GP104 [GeForce GTX 1060 3GB]
driver : nvidia-driver-410 - third-party free
driver : nvidia-driver-418 - third-party free recommended
driver : nvidia-driver-390 - distro non-free
driver : xserver-xorg-video-nouveau - distro free builtin
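Since `ubuntu-drivers autoinstall` picks whatever is marked recommended, it can help to extract that entry first and compare it with the version we want. A minimal sketch, assuming the output format shown above; `recommended_driver` is a hypothetical helper name.

```shell
# Print the package that `ubuntu-drivers devices` marks as recommended.
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}

# Fed with lines from the output above; on a real machine run
# `ubuntu-drivers devices | recommended_driver` instead.
recommended_driver <<'EOF'
driver : nvidia-driver-410 - third-party free
driver : nvidia-driver-418 - third-party free recommended
driver : nvidia-driver-390 - distro non-free
EOF
```

If the recommended entry is not the wanted version, installing the package by name with `sudo apt install nvidia-driver-418` is an alternative to autoinstall.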
# Install the recommended driver
sudo ubuntu-drivers autoinstall
# Reboot once the installation finishes
sudo reboot
# Check the driver
nvidia-smi
# Output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 106... On | 00000000:03:00.0 On | N/A |
| 36% 37C P8 7W / 120W | 434MiB / 3016MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1072 G /usr/lib/xorg/Xorg 16MiB |
| 0 1138 G /usr/bin/gnome-shell 49MiB |
| 0 1429 G /usr/lib/xorg/Xorg 122MiB |
| 0 1561 G /usr/bin/gnome-shell 155MiB |
| 0 2536 C python3 59MiB |
| 0 3421 G ...quest-channel-token=1415105501360332168 25MiB |
+-----------------------------------------------------------------------------+
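The "CUDA Version: 10.1" field in the header is the newest CUDA release this driver can run, not an installed toolkit, so it does not conflict with installing cuda-10.0. What matters is that the driver is new enough. The sketch below assumes 410.48 as the minimum Linux driver for CUDA 10.0 (taken from NVIDIA's release notes) and relies on GNU `sort -V`; `driver_at_least` is a hypothetical helper name.

```shell
# Succeed when driver version $1 is at least the minimum $2.
# The default 410.48 is assumed to be the Linux minimum for CUDA 10.0.
driver_at_least() {
    min="${2:-410.48}"
    # sort -V orders version strings; the minimum must come first (or tie).
    [ "$(printf '%s\n%s\n' "$min" "$1" | sort -V | head -n 1)" = "$min" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
else
    installed="418.67"   # value from the output above, used when nvidia-smi is absent
fi

if driver_at_least "$installed"; then
    echo "driver $installed is new enough for CUDA 10.0"
else
    echo "driver $installed is too old for CUDA 10.0"
fi
```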
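Install cuda-10.0 after the driver
Download the CUDA runfile
Download the runfile (local) installer from the official archive page: https://developer.nvidia.com/cuda-10.0-download-archive
Install CUDA
After the download finishes, run the file
sudo sh cuda_10.0.130_410.48_linux.run
Follow the prompts, and decline the driver installation step because the driver is already installed.
A successful installation creates the /usr/local/cuda-10.0 directory.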
Add the environment variables
sudo vim /etc/profile
# Append the following two lines to the file
export PATH=/usr/local/cuda-10.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:$LD_LIBRARY_PATH
# Reboot for the change to take effect
sudo reboot
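The two export lines work, but they add the same entry again every time the file is sourced, and when LD_LIBRARY_PATH starts out empty they leave a trailing colon, which the dynamic linker treats as the current directory. A slightly more defensive variant is sketched below. It could live in its own file under /etc/profile.d/ (for example a hypothetical cuda.sh), which /etc/profile sources on Ubuntu; `prepend_once` is a hypothetical helper name.

```shell
# Prepend directory $2 to the colon-separated list $1 unless it is already there.
prepend_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$2${1:+:$1}" ;;   # no trailing colon when $1 is empty
    esac
}

CUDA_HOME=/usr/local/cuda-10.0   # install prefix created by the runfile
PATH="$(prepend_once "$PATH" "$CUDA_HOME/bin")"
LD_LIBRARY_PATH="$(prepend_once "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib64")"
export PATH LD_LIBRARY_PATH
```

Sourcing the file in the current shell applies the change without a reboot.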
# Check the CUDA version
nvcc --version
# Output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
# CUDA 10.0 is now installed
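In a script, the release number can be extracted from that output and compared with the expected value. A minimal sketch, assuming the "release X.Y," wording shown above; `nvcc_release` is a hypothetical helper name.

```shell
# Print the release number (for example 10.0) from `nvcc --version` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# Fed with the last line of the output above; on a real machine run
# `nvcc --version | nvcc_release` instead.
echo "Cuda compilation tools, release 10.0, V10.0.130" | nvcc_release
```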
Install cuDNN
Download cuDNN from the official archive: https://developer.nvidia.com/rdp/cudnn-archive
Mind the version pairing: choose the build made for CUDA 10.0.
After downloading, change into the directory that contains the archive
tar xvzf cudnn-10.0-linux-x64-v7.6.1.34.tgz
sudo cp cuda/include/* /usr/local/cuda-10.0/include/
sudo cp -P cuda/lib64/* /usr/local/cuda-10.0/lib64/
sudo chmod a+r /usr/local/cuda-10.0/include/cudnn.h /usr/local/cuda-10.0/lib64/libcudnn*
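To confirm which cuDNN version ended up in the CUDA directory, the version macros in cudnn.h can be read back. A minimal sketch, assuming the cuDNN 7 header layout in which CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are plain `#define` lines; `cudnn_version` is a hypothetical helper name.

```shell
# Print MAJOR.MINOR.PATCHLEVEL from the cudnn.h file given as $1.
cudnn_version() {
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$1"
}

header=/usr/local/cuda-10.0/include/cudnn.h
if [ -r "$header" ]; then
    cudnn_version "$header"
else
    echo "cudnn.h not found at $header"
fi
```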
Install TensorFlow
# Install the stable tensorflow-gpu release, pinned to version 1.14.0
pip3 install tensorflow-gpu==1.14.0
Test TensorFlow
Run any script that uses TensorFlow; if it produces the expected output, the test passes.
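A script that merely runs does not prove the GPU is in use, because TensorFlow falls back to the CPU silently. A more direct check is sketched below, assuming the TensorFlow 1.x API, where `tf.test.is_gpu_available()` reports whether a CUDA device was found; `tf_gpu_check` is a hypothetical helper name.

```shell
# Ask TensorFlow 1.x whether it can see a CUDA-capable GPU.
tf_gpu_check() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
        return 0
    fi
    python3 - <<'EOF'
try:
    import tensorflow as tf
except ImportError:
    print("tensorflow is not installed")
else:
    print("tensorflow", tf.__version__)
    print("GPU available:", tf.test.is_gpu_available())
EOF
}

tf_gpu_check
```

If the last line reports False, the startup log usually names the library that failed to load, which points back to the CUDA or cuDNN step.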