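The Jetson TK1 cannot run training jobs on large datasets, so a more powerful GPU is needed for that work. This post therefore installs Ubuntu, CUDA, and cuDNN on a laptop.
Laptop: Lenovo Xiaoxin 700
CPU: Intel® Core™ i5-6300HQ CPU @ 2.30GHz × 4
GPU: GeForce GTX 950M/PCIe/SSE2
Ubuntu: 16.04
CUDA: 8.0
cuDNN: cudnn-8.0-linux-x64-v6.0.tgz
For the detailed installation procedure, see "Ubuntu16.04+CUDA8.0+caffe配置" (Ubuntu 16.04 + CUDA 8.0 + Caffe configuration).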
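- First, install the NVIDIA graphics driver. You can download the matching driver package from the NVIDIA website, or select an NVIDIA driver under System Settings > Software & Updates > Additional Drivers (the drivers listed there are usually marked "tested"). Here we take the second route, apply the change, and then reboot.
Tip: after updating the NVIDIA driver under Ubuntu, you need to enter the boot settings again; otherwise the boot entry cannot be found.
- After the reboot, open "About This Computer" to check whether the driver installed successfully; if it did, the NVIDIA card currently in use is shown there. You can also query it from the command line: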
sudo nvidia-smi
The command prints information about the NVIDIA card:
Wed Apr 19 11:04:15 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 0000:01:00.0     Off |                  N/A |
| N/A   40C    P8    N/A /  N/A |    105MiB /  2002MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1026    G   /usr/lib/xorg/Xorg                              64MiB |
|    0      1722    G   compiz                                          34MiB |
|    0      1953    G   fcitx-qimpanel                                   6MiB |
+-----------------------------------------------------------------------------+
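The driver version reported above (375.39) matters for the CUDA step later in this post: the installer file name cuda_8.0.61_375.26_linux.run indicates that it bundles driver 375.26, so a driver that is already at least that new can be kept. A minimal sketch of that comparison, assuming GNU `sort -V` is available; the helper names `driver_version` and `driver_at_least` are made up for illustration and are not part of any NVIDIA tool:

```shell
driver_version() {
  # Reads the default nvidia-smi output on stdin and prints the
  # version number that follows "Driver Version:" in the banner line.
  sed -n 's/.*Driver Version: *\([0-9.]*\).*/\1/p' | head -n 1
}

driver_at_least() {
  # $1 = installed version, $2 = required minimum.
  # Succeeds when the minimum sorts first (or equal) in version order.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

On the machine shown here, `driver_at_least "$(nvidia-smi | driver_version)" 375.26 && echo "driver is new enough"` would be one way to use it.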
- Update the packages and install the required software
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
sudo apt-get install --no-install-recommends libboost-all-dev
sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev
sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
- Next, install CUDA 8.0. The .run installer has already been downloaded:
cuda_8.0.61_375.26_linux.run
Run:
sudo sh cuda_8.0.61_375.26_linux.run
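After the long license text scrolls by, type accept. When asked whether to install the NVIDIA driver, answer no; for everything else answer yes or keep the default path. Then set the environment variables.
Open ~/.bashrc (no sudo is needed for a file in your own home directory):
vim ~/.bashrc
Append the following to the end of ~/.bashrc:
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}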
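The `${VAR:+:${VAR}}` part of the two export lines is easy to misread. It expands to a colon followed by the old value only when the variable is already set and non-empty, so no stray trailing colon is left behind when the variable starts out empty (which is common for LD_LIBRARY_PATH). The following sketch only illustrates that expansion; `prepend_path` is a hypothetical helper, not something the CUDA installer provides:

```shell
prepend_path() {
  # $1 = directory to put first, $2 = current value of the variable (may be empty).
  # ${existing:+:${existing}} yields ":<existing>" only when existing is non-empty.
  local dir="$1" existing="$2"
  printf '%s\n' "${dir}${existing:+:${existing}}"
}

prepend_path /usr/local/cuda-8.0/lib64 ""             # prints /usr/local/cuda-8.0/lib64
prepend_path /usr/local/cuda-8.0/bin "/usr/bin:/bin"  # prints /usr/local/cuda-8.0/bin:/usr/bin:/bin
```

After editing ~/.bashrc, run `source ~/.bashrc` or open a new terminal so the new values take effect.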
- Test the CUDA samples
cd /usr/local/cuda-8.0/samples/1_Utilities/deviceQuery # path depends on your own installation
sudo make # root privileges are required here
sudo ./deviceQuery
If the output lists the GPU's properties and ends with "Result = PASS", the installation succeeded:
Detected 1 CUDA Capable device(s)
Device 0: "GeForce GTX 950M"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 5.0
Total amount of global memory: 2003 MBytes (2100232192 bytes)
( 5) Multiprocessors, (128) CUDA Cores/MP: 640 CUDA Cores
GPU Max Clock rate: 1124 MHz (1.12 GHz)
Memory Clock rate: 1001 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 2097152 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 1 copy engine(s)
Run time limit on kernels: Yes
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 950M
Result = PASS
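Because the deviceQuery output is long, it can be convenient to check only the final verdict, for example in a setup script. A small sketch, assuming the output format shown above; `device_query_passed` is an illustrative name, not part of the CUDA samples:

```shell
device_query_passed() {
  # Reads deviceQuery output on stdin; succeeds only if a line
  # starting with "Result = PASS" is present.
  grep -q '^Result = PASS'
}
```

It could be used as `sudo ./deviceQuery | device_query_passed && echo "CUDA OK"`.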
- Next, set up cuDNN for acceleration. Choose cuDNN v6.0, the latest release built for CUDA 8.0:
cudnn-8.0-linux-x64-v6.0.tgz
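Extracting the archive yields a cuda folder containing two subfolders, include and lib64. Then copy the files as described in the tutorial.
# in the include folder
sudo cp cudnn.h /usr/local/cuda/include/ # copy the header
# in the lib64 folder
sudo cp lib* /usr/local/cuda/lib64/ # copy the shared libraries
cd /usr/local/cuda/lib64/
sudo rm -rf libcudnn.so libcudnn.so.6 # remove the existing links
sudo ln -s libcudnn.so.6.0.20 libcudnn.so.6 # create the symlink
sudo ln -s libcudnn.so.6 libcudnn.so # create the symlink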
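To confirm which cuDNN header actually ended up in /usr/local/cuda/include, the version macros in cudnn.h can be read back. This is a sketch that assumes the header defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL on separate `#define` lines, as the v6.0 header does; `cudnn_version` is a hypothetical helper name:

```shell
cudnn_version() {
  # $1 = path to cudnn.h; prints MAJOR.MINOR.PATCHLEVEL taken from its #define lines.
  local header="$1"
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END { print major "." minor "." patch }
  ' "$header"
}
```

Running `cudnn_version /usr/local/cuda/include/cudnn.h` should print a version that matches the libcudnn.so.6.0.20 file name used in the symlink commands.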
- Download and build OpenCV (the build takes a while)
opencv3.2-source: just extract this archive
# create the build folder
cd ~/opencv
mkdir build
cd build
# configure
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..
# build
make -j 8
# install
sudo make install
Problem encountered: "fatal error: LAPACKE_H_PATH-NOTFOUND when building OpenCV 3.2".
For the fix, see the discussion of the same title, "fatal error: LAPACKE_H_PATH-NOTFOUND when building OpenCV 3.2":
sudo apt-get install liblapacke-dev checkinstall
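A note on the `make -j 8` step above: the i5-6300HQ listed at the top of this post exposes 4 cores, so 8 parallel jobs oversubscribes the CPU. One option is to derive the job count from the machine instead of hard-coding it. A minimal sketch, assuming GNU coreutils `nproc`; `build_jobs` is a made-up helper name:

```shell
build_jobs() {
  # Print the number of available CPU cores, falling back to 1
  # if nproc is missing or fails.
  nproc 2>/dev/null || echo 1
}
```

The build line would then read `make -j"$(build_jobs)"`.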
- Build Caffe
make clean
make -j 8 all
make -j 8 runtest
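The make commands above read a Makefile.config in the Caffe source directory, which this post does not show. Assuming the stock Makefile.config.example from the Caffe repository, where the cuDNN and OpenCV 3 switches ship commented out as `# USE_CUDNN := 1` and `# OPENCV_VERSION := 3`, they could be enabled roughly like this. The helper name `enable_caffe_options` is hypothetical, and the sketch relies on GNU `sed -i`; other settings (for example HDF5 include paths on Ubuntu 16.04) may still need manual edits:

```shell
enable_caffe_options() {
  # $1 = path to Makefile.config; uncomments the cuDNN and OpenCV 3
  # switches in place and leaves every other line untouched.
  local cfg="$1"
  sed -i \
    -e 's/^# *USE_CUDNN := 1/USE_CUDNN := 1/' \
    -e 's/^# *OPENCV_VERSION := 3/OPENCV_VERSION := 3/' \
    "$cfg"
}
```

A typical sequence would be `cp Makefile.config.example Makefile.config` followed by `enable_caffe_options Makefile.config`, run before `make -j 8 all`.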
The problem encountered here was "libcudart.so.8.0 cannot open shared object file: No such file or directory".
To fix it, just run:
# mind your own CUDA version number!
sudo cp /usr/local/cuda-8.0/lib64/libcudart.so.8.0 /usr/local/lib/libcudart.so.8.0 && sudo ldconfig
sudo cp /usr/local/cuda-8.0/lib64/libcublas.so.8.0 /usr/local/lib/libcublas.so.8.0 && sudo ldconfig
sudo cp /usr/local/cuda-8.0/lib64/libcurand.so.8.0 /usr/local/lib/libcurand.so.8.0 && sudo ldconfig
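Another problem was "libcudnn.so.6 cannot open shared object file: No such file or directory".
This happens because the cuDNN library copied into /usr/local/cuda/lib64 was never copied to /usr/local/lib, where the dynamic loader looks: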
sudo cp /usr/local/cuda/lib64/libcudnn.so.6 /usr/local/lib/libcudnn.so.6 && sudo ldconfig
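Rather than discovering missing libraries one error at a time, all unresolved shared libraries of a binary can be listed in one pass from `ldd` output, where they appear as `name => not found`. A small sketch; `missing_libs` is an illustrative helper, and the binary path build/tools/caffe in the usage line is an assumption about where the Caffe build places its executable:

```shell
missing_libs() {
  # Reads ldd output on stdin and prints the name of every
  # library that the dynamic loader could not resolve.
  awk '/=> not found/ { print $1 }'
}
```

For example, `ldd build/tools/caffe | missing_libs` prints nothing once every library has been copied and `sudo ldconfig` has refreshed the cache.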
- One more problem came up during the build:
/sbin/ldconfig.real: /usr/lib/nvidia-375/libEGL.so.1 is not a symbolic link
/sbin/ldconfig.real: /usr/lib32/nvidia-375/libEGL.so.1 is not a symbolic link
For the fix, see "libEGL.so.1 不是符号连接" (libEGL.so.1 is not a symbolic link):
sudo mv /usr/lib/nvidia-375/libEGL.so.1 /usr/lib/nvidia-375/libEGL.so.1.org
sudo mv /usr/lib32/nvidia-375/libEGL.so.1 /usr/lib32/nvidia-375/libEGL.so.1.org
sudo ln -s /usr/lib/nvidia-375/libEGL.so.375.39 /usr/lib/nvidia-375/libEGL.so.1
sudo ln -s /usr/lib32/nvidia-375/libEGL.so.375.39 /usr/lib32/nvidia-375/libEGL.so.1