Installing the Nvidia Driver and TensorFlow-GPU on Ubuntu 18.04
Installation date: 2019.2.22
Version compatibility
The current TensorFlow release is 1.12.0, which only works with CUDA 9.x and cuDNN 7.x. The gcc version also needs to be downgraded accordingly; see below.
Installation steps
Disable Ubuntu's built-in nouveau driver
Remove any old drivers:
sudo apt-get purge nvidia*
Create the file:
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
and add the following content:
blacklist nouveau
options nouveau modeset=0
Apply the change and reboot:
sudo update-initramfs -u
sudo reboot
After rebooting, check whether nouveau has been disabled:
lsmod | grep nouveau
If the command prints nothing, nouveau has been disabled successfully.
Install the Nvidia driver automatically from the standard Ubuntu repositories
Detect the NVIDIA GPU model and the recommended driver by entering the following command:
$ ubuntu-drivers devices
== /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
modalias : pci:v000010DEd00001180sv00001458sd0000353Cbc03sc00i00
vendor : NVIDIA Corporation
model : GK104 [GeForce GTX 680]
driver : nvidia-304 - distro non-free
driver : nvidia-340 - distro non-free
driver : nvidia-384 - distro non-free recommended
driver : xserver-xorg-video-nouveau - distro free builtin
== cpu-microcode.py ==
driver : intel-microcode - distro free
The output shows that the system has an Nvidia GeForce GTX 680 and that the recommended driver is nvidia-384. If you agree with the recommendation, use the ubuntu-drivers command again to install all recommended drivers.
Enter the following command:
$ sudo ubuntu-drivers autoinstall
Once the installation finishes, reboot the system and you are done.
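After the reboot, you can optionally confirm that the Nvidia kernel module is loaded before moving on (a quick sanity check; the full nvidia-smi output is shown later in the CUDA section):
lsmod | grep nvidia
nvidia-smi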
Install CUDA
Downgrade gcc
Ubuntu 18.04 ships with GCC 7.3.0 by default. Since this CUDA release does not support GCC 7, install a lower version (gcc-5, or any version <= 6.3.0) and set it as the default:
sudo apt install gcc-5 g++-5
Set the default version:
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 50 --slave /usr/bin/g++ g++ /usr/bin/g++-5
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 --slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --config gcc
Now select gcc-5 as the default by entering the number listed for gcc-5 (1 in the output below):
There are 2 choices for the alternative gcc (providing /usr/bin/gcc).
Selection Path Priority Status
------------------------------------------------------------
* 0 /usr/bin/gcc-7 70 auto mode
1 /usr/bin/gcc-5 50 manual mode
2 /usr/bin/gcc-7 70 manual mode
Press <enter> to keep the current choice[*], or type selection number:
Check the default GCC version:
gcc --version
Download and install CUDA
Run the downloaded .run file with sudo, and be sure to skip the driver installation step inside the installer, since the driver was already installed above.
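A minimal sketch of this step, assuming the CUDA 9.1 runfile (substitute the name of the runfile you actually downloaded):
cd ~/Downloads
# when prompted, answer "no" to the bundled graphics driver and "yes" to the toolkit and samples
sudo sh cuda_9.1.85_387.26_linux.run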
Set the environment variables:
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
Verify
$ nvidia-smi
Sun Apr 29 18:01:43 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.48 Driver Version: 390.48 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 960M Off | 00000000:01:00.0 Off | N/A |
| N/A 63C P8 N/A / N/A | 474MiB / 2004MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1019 G /usr/lib/xorg/Xorg 24MiB |
| 0 1119 G /usr/bin/gnome-shell 48MiB |
| 0 1380 G /usr/lib/xorg/Xorg 155MiB |
| 0 1606 G /usr/bin/gnome-shell 162MiB |
| 0 2661 G ...-token=1EB02DD7413DCE134FBD8EDB1260A396 72MiB |
| 0 10203 G ...are/jetbrains-toolbox/jetbrains-toolbox 5MiB |
+-----------------------------------------------------------------------------+
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Confirm that CUDA works
Find the samples, usually under the home directory:
cd ~/NVIDIA_CUDA-9.1_Samples/
make
Wait for the build to finish, then:
cd ./bin/x86_64/linux/release
Test with deviceQuery or bandwidthTest:
$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 960M"
CUDA Driver Version / Runtime Version 9.1 / 9.1
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2004 MBytes (2101870592 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1176 MHz (1.18 GHz)
Memory Clock rate: 2505 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Supports Cooperative Kernel Launch: No
Supports MultiDevice Co-op Kernel Launch: No
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.1, CUDA Runtime Version = 9.1, NumDevs = 1
Result = PASS
Or:
$ ./bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...
Device 0: GeForce GTX 960M
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 12339.9
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 11720.0
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 65699.6
Result = PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
Install cuDNN
Extract the downloaded cuDNN .tgz archive with tar (use a cuDNN 7.x build matching CUDA 9.x; the filename below will match whatever you downloaded):
cd Downloads
tar -zxvf cudnn-8.0-linux-x64-v6.0.tgz
cd cuda/include/
sudo cp cudnn.h /usr/local/cuda/include/ # copy the header file
cd ../lib64 # change to the lib64 directory
sudo cp lib* /usr/local/cuda/lib64/ # copy the library files
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn* # give all users read permission on these files
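To double-check which cuDNN version now sits under /usr/local/cuda, you can read the version macros from the copied header (an extra check, assuming a cuDNN 7.x-style cudnn.h):
cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2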
Install tensorflow-gpu
pip3 install tensorflow-gpu
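Note that an unpinned install may pull a TensorFlow release built against CUDA 10. To stay on the CUDA 9-compatible line mentioned at the top of this guide, you can pin the version (assuming 1.12.0 is the intended release):
pip3 install tensorflow-gpu==1.12.0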
If the installation succeeds, run the following commands to check the result:
python3
>>> import tensorflow as tf
>>> sess = tf.Session()
>>> a = tf.constant('hello, world')
>>> print(sess.run(a))
If the log printed when the session is created mentions your GPU (lines containing the device name and "GPU"), TensorFlow is using the GPU successfully.
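As an extra check, tf.test.is_gpu_available() from the TensorFlow 1.x API reports whether a CUDA device is visible (a one-line sketch, not part of the original steps):
python3 -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
It prints True when TensorFlow can use the GPU.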