Installing the Nvidia Driver and TensorFlow-GPU on Ubuntu 18.04
Installation date: 2019.2.22
Version compatibility
The current TensorFlow release is 1.12.0, which only works with CUDA 9.x and cuDNN 7.x. The gcc version also needs to be downgraded accordingly; see below.
Installation steps
Disable Ubuntu's built-in nouveau driver
Remove any old drivers:
sudo apt-get purge nvidia*
Create the file:
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
and add the following content:
blacklist nouveau
options nouveau modeset=0
Apply the change and reboot:
sudo update-initramfs -u
sudo reboot
After rebooting, check whether nouveau has been disabled:
lsmod | grep nouveau
If the command prints nothing, nouveau has been disabled successfully.
Install the Nvidia driver automatically from the standard Ubuntu repositories
Detect the NVIDIA GPU model and the recommended driver by entering the following command:
$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001180sv00001458sd0000353Cbc03sc00i00
vendor : NVIDIA Corporation
model : GK104 [GeForce GTX 680]
driver : nvidia-304 - distro non-free
driver : nvidia-340 - distro non-free
driver : nvidia-384 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
== cpu-microcode.py ==
driver : intel-microcode - distro free
The output shows that the system has an Nvidia GeForce GTX 680 and that the recommended driver is nvidia-384. If you agree with the recommendation, use the ubuntu-drivers command again to install all recommended drivers.
Enter the following command:
$ sudo ubuntu-drivers autoinstall
Once the installation finishes, reboot the system and you are done.
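After the reboot, you can optionally confirm that the Nvidia kernel module is loaded before moving on (a quick sanity check; the full nvidia-smi output is shown later in the CUDA section):
lsmod | grep nvidia
nvidia-smi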
Install CUDA
Downgrade gcc
Ubuntu 18.04 ships with GCC 7.3.0 by default. Since this CUDA release does not support GCC 7, install a lower version (gcc-5, or any version <= 6.3.0) and set it as the default:
sudo apt install gcc-5 g++-5
Set the default version:
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 50 --slave /usr/bin/g++ g++ /usr/bin/g++-5
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 --slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --config gcc
Now select gcc-5 as the default by entering the number listed for gcc-5 (1 in the output below):
There are 2 choices for the alternative gcc (providing /usr/bin/gcc).
Selection Path Priority Status
------------------------------------------------------------
* 0 /usr/bin/gcc-7 70 auto mode
1 /usr/bin/gcc-5 50 manual mode
2 /usr/bin/gcc-7 70 manual mode
Press <enter> to keep the current choice[*], or type selection number:
Check the default GCC version:
gcc --version
Download and install CUDA
Run the downloaded .run file with sudo, and be sure to skip the driver installation step inside the installer, since the driver was already installed above.
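A minimal sketch of this step, assuming the CUDA 9.1 runfile (substitute the name of the runfile you actually downloaded):
cd ~/Downloads
# when prompted, answer "no" to the bundled graphics driver and "yes" to the toolkit and samples
sudo sh cuda_9.1.85_387.26_linux.run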
Set the environment variables:
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Verify
$ nvidia-smi
Sun Apr 29 18:01:43 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48 Driver Version: 390.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 960M Off | 00000000:01:00.0 Off | N/A |
| N/A 63C P8 N/A / N/A | 474MiB / 2004MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1019 G /usr/lib/xorg/Xorg 24MiB |
| 0 1119 G /usr/bin/gnome-shell 48MiB |
| 0 1380 G /usr/lib/xorg/Xorg 155MiB |
| 0 1606 G /usr/bin/gnome-shell 162MiB |
| 0 2661 G ...-token=1EB02DD7413DCE134FBD8EDB1260A396 72MiB |
| 0 10203 G ...are/jetbrains-toolbox/jetbrains-toolbox 5MiB |
+-----------------------------------------------------------------------------+
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Confirm that CUDA works
Find the samples, usually under the home directory:
cd ~/NVIDIA_CUDA-9.1_Samples/
make
Wait for the build to finish, then:
cd ./bin/x86_64/linux/release
Test with deviceQuery or bandwidthTest:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 960M"
CUDA Driver Version / Runtime Version 9.1 / 9.1
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2004 MBytes (2101870592 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1176 MHz (1.18 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
Or:
$ ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce GTX 960M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 12339.9
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 11720.0
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 65699.6
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Install cuDNN
Extract the downloaded cuDNN .tgz archive with tar (use a cuDNN 7.x build matching CUDA 9.x; the filename below will match whatever you downloaded):
cd Downloads
tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz
cd cuda/include/
sudo cp cudnn.h /usr/local/cuda/include/ # copy the header file
cd ../lib64 # change to the lib64 directory
sudo cp lib* /usr/local/cuda/lib64/ # copy the library files
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn* # give all users read permission on these files
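To double-check which cuDNN version now sits under /usr/local/cuda, you can read the version macros from the copied header (an extra check, assuming a cuDNN 7.x-style cudnn.h):
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2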
Install tensorflow-gpu
pip3 install tensorflow-gpu
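Note that an unpinned install may pull a TensorFlow release built against CUDA 10. To stay on the CUDA 9-compatible line mentioned at the top of this guide, you can pin the version (assuming 1.12.0 is the intended release):
pip3 install tensorflow-gpu==1.12.0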
If the installation succeeds, run the following commands to check the result:
python3
>>> import tensorflow as tf
>>> sess = tf.Session()
>>> a = tf.constant('hello, world')
>>> print(sess.run(a))
If the log printed when the session is created mentions your GPU (lines containing the device name and "GPU"), TensorFlow is using the GPU successfully.
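As an extra check, tf.test.is_gpu_available() from the TensorFlow 1.x API reports whether a CUDA device is visible (a one-line sketch, not part of the original steps):
python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
It prints True when TensorFlow can use the GPU.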