Build Personal Deep Learning Rig: GTX 1080 + Ubuntu 16.04 + CUDA 8.0RC + CuDnn 7 + Tensorflow/Mxnet/Caffe/Darknet

Reposted from Guanghan Ning's blog

My internship at TCL will be over soon. Before going back to campus for graduation, I decided to build myself a personal deep learning rig. I cannot really rely on the machines at the company or in the lab, because ultimately those workstations are not mine, and the development environment may get messed up (it has already happened once). With a personal rig, I can conveniently use TeamViewer to log in to my deep learning workstation at any time, and I get the chance to build everything from scratch.

In this post, I will go through the whole process of building a deep learning PC, including both the hardware and the software. I hope it will be helpful to researchers and engineers with the same needs. Since I am building the rig with a GTX 1080, Ubuntu 16.04, CUDA 8.0RC, and CuDnn 7, everything is pretty up-to-date. Here is an overview of this article:

Hardware

1. Pick Parts

2. Build the workstation

Software

3. Operating System Installation

Preparing bootable installation USB drives

Build systems

4. Deep Learning Environment Installation

Remote Control

teamviewer

Bundle Management

anaconda

Development Environment

python IDE

GPU-optimized Environment

CUDA

CuDnn

Deep Learning Frameworks

Tensorflow

Mxnet

Caffe

Darknet

5. Docker for Out-of-the-Box Deep Learning Environment

Install Docker

Install NVIDIA-Docker

Download Deep Learning Docker Images

Share Data between Host and Container

Learn Simple Docker Commands

Hardware:

1. Pick Parts

I recommend using PcPartPicker to pick your parts. It helps you find where to buy each part at the lowest available price, and it checks the compatibility of the selected parts for you. They also have a YouTube channel with videos that demonstrate the building process.

In my case, I used their build article as reference, and created a build list myself, which can be found [here]. Here are the parts that I used to build the workstation.

Since we are doing deep learning research, a good GPU is necessary. Therefore, I chose the recently released GTX 1080. It was quite hard to buy, but if you look at the bundles on Newegg, some sellers offer it in [GPU + motherboard] or [GPU + power supply] bundles. Market, you know. Buying a bundle is still better than buying the card alone at a raised price, though. Anyway, a good GPU will make training and fine-tuning much faster. Here are some figures showing the advantage of the GTX 1080 over some other GPUs with respect to performance, price, and power efficiency (which saves you electricity every day, as well as money on an appropriately sized PC power supply).

Note that the GTX 1080 has only 8GB of memory, compared to the 12GB of the TITAN X. You may be richer or more generous to yourself and therefore consider using multiple GPUs. If so, remember to choose a motherboard that has more PCIe slots.

2. Build from Parts

As for building from the parts, I followed the tutorial in [this video]. Although the parts are slightly different, the building process is quite similar. I had no previous experience building a PC on my own, but with this tutorial I was able to make it work within 3 hours. (It should take you less time, but I was extremely cautious, you know.)

Software:

3. Installation of Operating Systems

It is very common to use Ubuntu for deep learning research, but sometimes you may need another operating system as well. For example, if you are also a VR developer with a GTX 1080, you may want Windows 10 for VR development with Unity or similar tools. Here I cover the installation of both Windows 10 and Ubuntu. If you are only interested in the Ubuntu installation, you can skip installing Windows.

3.1 Preparing bootable installation USB drives

It is very convenient to install operating systems from USB drives, as we all have them. Because the USB drive will be formatted in the process, do not use a portable hard disk that holds data you want to keep. Alternatively, if you have writable DVDs, you can use them to install operating systems and save them for future use, if you can find them again by then.

Since it is well illustrated on the official website, you can go to the [Windows 10 page] to learn how to make the USB drive. As for Ubuntu, you can similarly download the ISO and create USB installation media or burn it to a DVD. If you are currently using Ubuntu, follow [this tutorial] from the official Ubuntu website. If you are currently using Windows, follow [this tutorial] instead.

3.2 Installing Systems

It is highly recommended that you install Windows first for a dual-boot installation. I will skip the Windows 10 installation, as a detailed guide can be found here: [Windows 10 page]. One thing to note is that you will need an activation key. You can find it on the tag on the bottom of your laptop, if it came with Windows 7 or Windows 10 pre-installed.

Installing Ubuntu 16.04 was a little tricky for me, which was kind of a surprise. It was mainly because I did not have the GTX 1080 driver installed at the very beginning. I will share my story with you, in case you encounter the same problems.

Installing Ubuntu

First things first, insert the bootable USB drive for installation. Nothing showed on my LG screen except a message saying that the frequency was too high. The screen itself was fine, as I tested it with a laptop. I then connected the PC to a TV, which did show a picture, but only the desktop with no tool panel. I figured out that the problem was the NVIDIA driver. So I went into the BIOS, set the integrated graphics as the default, and restarted. Remember to move the HDMI cable from the port on the GTX 1080 to the one on the motherboard. After that the display worked well, and I successfully installed Ubuntu by following its prompts.

In order to use the GTX 1080, go to [this page] to get the NVIDIA display driver for Ubuntu. When installing this driver, make sure that the GTX 1080 is seated in the motherboard.

At this point the installer showed “You appear to be running an X server...”. I followed this link to solve the problem and installed the driver. I quote it here:

Make sure you are logged out.

Hit CTRL+ALT+F1 and log in using your credentials.

Kill your current X server session by typing sudo service lightdm stop or sudo stop lightdm

Enter runlevel 3 by typing sudo init 3 and install your *.run file.

You might be required to reboot when the installation finishes. If not, run sudo service lightdm start or sudo start lightdm to start your X server again.

After installing the driver, we can restart and set the GTX 1080 as the default in the BIOS. We are good to go.

Some other small problems I encountered are listed here, in case they are helpful:

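Problem: After restarting, I couldn't find the option to boot Windows.

Solution: In Ubuntu, run sudo gedit /boot/grub/grub.cfg and add the following lines:

menuentry 'Windows 10' {

set root='hd0,msdos1'

chainloader +1

}

Note that update-grub regenerates grub.cfg, so the entry survives longer if it is placed in /etc/grub.d/40_custom instead.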

Problem: Ubuntu does not support the Belkin N300 wireless adapter, which is commonly sold at Best Buy.

Solution: Follow the instructions in this link, and the problem will be solved.

Problem: Upon installing TeamViewer, it says “dependencies not met”.

Solution: Refer to this link.

4. Deep Learning Environment

4.1 Installation of Remote Control (TeamViewer):

sudo dpkg -i teamviewer_11.0.xxxxx_i386.deb

4.2 Installation of Package Management System (Anaconda):

Anaconda is an easy-to-install, free package manager, environment manager, and Python distribution, with a collection of over 720 open source packages and free community support. It can be used to create virtual environments that do not interfere with each other, which is helpful when we use different deep learning frameworks with different configurations at the same time. Installing packages with it is convenient as well. It can be easily installed; follow this.

Some commands to start using virtual environment:

source activate [virtualenv]

source deactivate

4.3 Installation of Development Environment (Python IDE):

4.3.1 Spyder vs Pycharm?

Spyder:

Advantage: MATLAB-like, easy to review intermediate results.

Pycharm:

Advantage: modular coding, a more complete IDE for web development frameworks, and cross-platform.

In my personal philosophy, I regard them as merely tools, each to be used when it comes in handy. I use an IDE to construct the backbone of a project; for example, I use PyCharm for the framework construction. After that, I just modify code with Vim. It is not that Vim is so powerful and showy, but that it is the single text editor I really want to master. As for text editors, there is no need to master two. For special occasions where we need to frequently check I/O, directories, etc., we might want to use Spyder instead.

4.3.2 Installation:

spyder:

You do not need to install Spyder, as it is included in Anaconda.

pycharm

Download it from the official website. Just unzip.

Set Anaconda as the project interpreter for PyCharm so that it handles package management. Follow this.

vim

sudo apt-get install vim

The configuration that I currently use: Github

Git

sudo apt install git

git config --global user.name "Guanghan Ning"

git config --global user.email "guanghan.ning@gmail.com"

git config --global core.editor vim

git config --list

4.4 Installation of GPU-Optimized Computing Environment (CUDA and CuDNN)

4.4.1 CUDA

Install CUDA 8.0 RC

There are two reasons to choose the 8.0 version over 7.5:

CUDA 8.0 will give a performance gain for GTX1080 (Pascal), compared to CUDA 7.5.

It seems that Ubuntu 16.04 does not support CUDA 7.5, because no download for it is offered on the official website. Therefore CUDA 8.0 is the only choice.

CUDA starter Guide

CUDA Installer Guide

sudo sh cuda_8.0.27_linux.run

Follow the command-line prompts

As part of the CUDA environment setup, you should add the following to the ~/.bashrc file in your home folder.

export CUDA_HOME=/usr/local/cuda-8.0

export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}

PATH=${CUDA_HOME}/bin:${PATH}

export PATH

Check that CUDA is installed (remember to restart the terminal first):

nvcc --version
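
If nvcc is not found, the environment variables are the first thing I would look at. Here is a minimal Python sketch for that check. The helper name check_cuda_env and its messages are my own illustration and not part of CUDA; it only assumes the CUDA_HOME, LD_LIBRARY_PATH, and PATH settings described above.

```python
import os


def check_cuda_env(environ):
    """Return a list of problems found in a CUDA-related environment mapping.

    Hypothetical helper: it only looks at the three variables set in ~/.bashrc.
    """
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        return ["CUDA_HOME is not set"]

    # Directories that the ~/.bashrc lines are expected to register.
    lib_dir = os.path.join(cuda_home, "lib64")
    bin_dir = os.path.join(cuda_home, "bin")

    problems = []
    if lib_dir not in environ.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        problems.append(lib_dir + " is missing from LD_LIBRARY_PATH")
    if bin_dir not in environ.get("PATH", "").split(os.pathsep):
        problems.append(bin_dir + " is missing from PATH")
    return problems


if __name__ == "__main__":
    # Check the environment of the current terminal session.
    for line in check_cuda_env(os.environ) or ["CUDA environment variables look consistent"]:
        print(line)
```

Run it in a freshly opened terminal; an empty problem list only means the variables are consistent, not that the driver itself works.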

4.4.2 cuDNN (CUDA Deep Neural Network Library)

Install cuDNN

Version: cuDNN v5.0 for CUDA 8.0RC

User Guide

Install Guide

Choice one: (Add the cuDNN path to the environment variables)

Extract the folder “cuda”

cd [install_path]

export LD_LIBRARY_PATH=`pwd`:$LD_LIBRARY_PATH

Choice two: (Copy the cuDNN files into the CUDA folder. If CUDA is working properly, it will find cuDNN automatically by relative path)

tar xvzf cudnn-8.0.tgz

cd cuda

sudo cp include/cudnn.h /usr/local/cuda/include

sudo cp lib64/libcudnn* /usr/local/cuda/lib64

sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
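
To confirm which cuDNN version actually ended up in the CUDA folder, one option is to read the version macros from the copied header. The sketch below is my own illustration; it assumes that cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL, and that the header sits at /usr/local/cuda/include/cudnn.h as in choice two.

```python
import re


def parse_cudnn_version(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    version = []
    for macro in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + macro + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        version.append(int(match.group(1)))
    return tuple(version)


if __name__ == "__main__":
    # Assumed location: the header copied by the commands above.
    header_path = "/usr/local/cuda/include/cudnn.h"
    try:
        with open(header_path) as header:
            print(parse_cudnn_version(header.read()))
    except IOError:
        print("cudnn.h not found at " + header_path)
```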

4.5 Installation of Deep Learning Frameworks:

4.5.1 Tensorflow / keras

Install tensorflow first

Install with anaconda

conda create -n tensorflow python=3.5

Install TensorFlow using pip in the environment (the prebuilt binaries do NOT support CUDA 8.0 at the moment; I will update this when binaries for CUDA 8.0 come out)

source activate tensorflow

sudo apt install python3-pip

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0-cp35-cp35m-linux_x86_64.whl

pip3 install --upgrade $TF_BINARY_URL
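
A common reason for pip to reject a wheel is that its Python tag does not match the interpreter of the active environment (here cp35 for Python 3.5). The following Python sketch shows how the tag can be read from the URL. Both function names are hypothetical helpers of mine, and the parsing relies only on the standard wheel naming scheme, in which the file name ends with the Python tag, ABI tag, and platform tag.

```python
import sys


def wheel_python_tag(wheel_url):
    """Return the Python tag (e.g. 'cp35') from a wheel file name or URL."""
    file_name = wheel_url.rsplit("/", 1)[-1]
    if not file_name.endswith(".whl"):
        raise ValueError("not a wheel: " + file_name)
    # Wheel names end with <python tag>-<abi tag>-<platform tag>.whl
    return file_name[:-len(".whl")].split("-")[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for a version tuple, e.g. 'cp35' for (3, 5)."""
    return "cp{}{}".format(version_info[0], version_info[1])


if __name__ == "__main__":
    url = ("https://storage.googleapis.com/tensorflow/linux/gpu/"
           "tensorflow-0.9.0-cp35-cp35m-linux_x86_64.whl")
    # The two tags should be equal inside the tensorflow environment.
    print(wheel_python_tag(url), interpreter_tag())
```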

Install Tensorflow directly from source

install bazel

install jdk 8

uninstall jdk 9

sudo apt-get install python-numpy swig python-dev

./configure

build with bazel

bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu

Install keras

Download it at: https://github.com/fchollet/keras/tree/master/keras

cd to the Keras folder and run the install command:

sudo python setup.py install

Change the default backend from Theano to TensorFlow
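
As far as I know, Keras reads its backend from the "backend" key of a JSON config file, normally ~/.keras/keras.json. A minimal Python sketch for switching it is shown below; the helper name set_keras_backend is my own, and the file location is an assumption that you should verify on your machine.

```python
import json
import os


def set_keras_backend(config_path, backend="tensorflow"):
    """Write the chosen backend into a Keras JSON config, keeping other keys."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as config_file:
            config = json.load(config_file)
    config["backend"] = backend

    # Create the config folder if Keras has not been run yet.
    config_dir = os.path.dirname(config_path)
    if config_dir and not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    with open(config_path, "w") as config_file:
        json.dump(config, config_file, indent=4)
    return config
```

For example, set_keras_backend(os.path.expanduser("~/.keras/keras.json")) rewrites the default config in place. Editing the file by hand in Vim works just as well.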

Use conda to activate/deactivate the virtual environment

source activate tensorflow

source deactivate

4.5.2 Mxnet

Create a virtual environment for Mxnet

conda create -n mxnet python=2.7

source activate mxnet

Follow the official website to install MXNet

sudo apt-get update

sudo apt-get install -y build-essential git libatlas-base-dev libopencv-dev

git clone --recursive https://github.com/dmlc/mxnet

cd mxnet

Edit make/config.mk:

set USE_CUDA = 1 and USE_CUDNN = 1, and set USE_CUDA_PATH to the CUDA install path

make clean_all

make -j4
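
The edit to make/config.mk can also be scripted. The Python sketch below is only an illustration: it assumes the switches are named USE_CUDA, USE_CUDA_PATH, and USE_CUDNN, as in the MXNet config.mk I used, and that CUDA lives under /usr/local/cuda. The function name enable_cuda is hypothetical.

```python
import re


def enable_cuda(config_text, cuda_path="/usr/local/cuda"):
    """Return config.mk text with the CUDA and cuDNN switches turned on."""
    settings = [
        ("USE_CUDA", "1"),
        ("USE_CUDA_PATH", cuda_path),
        ("USE_CUDNN", "1"),
    ]
    for name, value in settings:
        new_line = name + " = " + value
        # Match the whole assignment line; USE_CUDA does not match USE_CUDA_PATH
        # because only spaces or tabs may sit between the name and '='.
        pattern = r"^" + name + r"[ \t]*=.*$"
        config_text, count = re.subn(pattern, lambda _match: new_line,
                                     config_text, flags=re.MULTILINE)
        if count == 0:
            # Variable not present: append it so that make still sees it.
            config_text += "\n" + new_line + "\n"
    return config_text
```

Read the file, pass its text through enable_cuda, and write the result back before running make.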

One problem I encountered was “gcc version later than 5.3 not supported!” My gcc version was 5.4, so I had to remove it and install an older one.

sudo apt-get remove gcc g++

conda install -c anaconda gcc=4.8.5

gcc --version
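
To see in advance whether the installed compiler will trigger this error, the version can be parsed from the output of the gcc version command. This is a rough Python sketch under the assumption that the limit is exactly the one quoted in the error message (nothing later than 5.3); both function names are my own.

```python
import re


def gcc_version(version_output):
    """Parse the (major, minor, patch) tuple from the output of the gcc version command."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found")
    return tuple(int(part) for part in match.groups())


def gcc_is_supported(version_output, newest_supported=(5, 3)):
    """True if the gcc (major, minor) version is not later than newest_supported."""
    return gcc_version(version_output)[:2] <= newest_supported
```

The output text can be obtained with subprocess.check_output(["gcc", "--version"]).decode() and passed straight to gcc_is_supported.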

Python package install for MXNet

conda install -c anaconda numpy=1.11.1

Method 1:

sudo apt-get install python-setuptools

cd python; sudo python setup.py install

Method 2:

cd mxnet

cp -r ../mxnet/python/mxnet .

cp ../mxnet/lib/libmxnet.so mxnet/

Quick test:

python example/image-classification/train_mnist.py

GPU test:

python example/image-classification/train_mnist.py --network lenet --gpus 0

4.5.3 Caffe

Follow this detailed guide: Caffe Ubuntu 16.04 or 15.10 Installation Guide

OpenCV is needed. For the installation of OpenCV 3.1, refer to this link: Ubuntu 16.04 or 15.10 OpenCV 3.1 Installation Guide

4.5.4 Darknet

This is the easiest of all to install. Just type “make”, and that’s it.

5. Docker for Out-of-the-Box Deep Learning Environment

I used to have Caffe, Darknet, MXNet, and TensorFlow all installed correctly on Ubuntu 14.04 with a TITAN X (CUDA 7.5), and the projects I did with these frameworks all worked well. If you want to focus on deep learning research instead of being bothered by the peripheral problems you may encounter, it is therefore safer to use such pre-built environments than to venture into the latest versions. In that case, you should consider isolating each framework in its own environment using Docker. These Docker images can be found on DockerHub.

5.1 Install Docker

Unlike a virtual machine image, a Docker image is built in layers, and identical layers are shared among different images. When we download a new image, components that already exist locally won't be re-downloaded. This is more efficient and convenient than replacing a whole virtual machine image. Docker containers are like the run-time instances of Docker images. They can be committed to update Docker images, much like commits in Git.

To install Docker on Ubuntu 16.04, we follow the instructions on the official website.

5.2 Install NVIDIA-Docker

Docker containers are both hardware-agnostic and platform-agnostic, but Docker does not natively support NVIDIA GPUs within containers. (The hardware is specialized, and a driver is needed.) To solve this problem, we need nvidia-docker to mount the devices and driver files when starting the container on the target machine. In this way, the image is agnostic of the NVIDIA driver.

The installation instructions for NVIDIA-Docker can be found here.

5.3 Download Deep Learning Docker Images

I have collected some pre-built docker images from the Docker Hub. They are listed here:

cuda-caffe

cuda-mxnet

cuda-keras-tensorflow-jupyter

More can be easily found on docker hub.

5.4 Share Data between Host and Container

For computer vision researchers, it would be awkward not to see results. For instance, after adding some Picasso style to an image, we would definitely want to see the output images from different epochs. Check out this page to quickly learn how to share data between the host and the container. In a shared directory, we can create projects. On the host, we can start coding with text editors or whatever IDEs we prefer, and then we can run the program in the container. The data in the shared directory can be viewed and processed with the GUI of the host Ubuntu machine.
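
One common way to set up such a shared directory is the -v HOST_DIR:CONTAINER_DIR option of docker run. The Python sketch below only assembles the command line, so you can see how the pieces fit together; it assumes nvidia-docker is installed as described in 5.2, and the helper name, the paths, and the choice of image in the example are placeholders of mine.

```python
def docker_run_command(image, host_dir, container_dir, use_gpu=True, temporary=True):
    """Build the argument list for an interactive container with a shared folder."""
    # nvidia-docker wraps the docker CLI so that GPU devices and driver files are mounted.
    command = ["nvidia-docker" if use_gpu else "docker", "run", "-it"]
    if temporary:
        command.append("--rm")  # remove the container when it exits
    # -v HOST_DIR:CONTAINER_DIR makes the host directory visible inside the container.
    command += ["-v", "{}:{}".format(host_dir, container_dir)]
    command.append(image)
    return command


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(docker_run_command("ning/cuda-mxnet", "/home/me/projects", "/workspace")))
```

The resulting list can be handed to subprocess.call, or the printed line can simply be pasted into a terminal.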

5.5 Learn Simple Docker Commands

Don’t be overwhelmed if you are new to Docker. It does not need to be studied systematically unless you want to do so in the future. Here are some simple commands to get you started. Usually they are sufficient if you consider Docker a tool and want to use it solely for a deep learning environment.

How to check the docker images?

docker images: Check all the docker images that you have.

How to check the docker containers?

docker ps -a: Check all the containers that you have.

docker ps: Check containers that are running

How to exit a docker container?

(Method 1) In the terminal corresponding to the current container:

exit

(Method 2) Use [Ctrl + Alt + T] to open a new terminal, or use [Ctrl + Shift + T] to open a new terminal tab:

docker ps -a: Check the containers you have.

docker ps: Check the running container(s).

docker stop [container’s ID]: Stop this container.

How to remove a docker image?

docker rmi [docker_image_name]

How to remove a docker container?

docker rm [docker_container_name]

How to create our own docker image, based on one that is from someone else? (Update a container created from an image and commit the results to an image.)

Load the image and open a container

Make some changes in the container

Commit to the image

docker commit -m "Message: Added changes" -a "Author: Guanghan" 0b2616b0e5a8 ning/cuda-mxnet

Copy data between host and the docker container:

docker cp foo.txt mycontainer:/foo.txt

docker cp mycontainer:/foo.txt foo.txt

Open a container from a docker image:

If the container should be kept, for example because it will probably be committed later:

docker run -it [image_name]

If the container is only for temporary use:

docker run --rm -it [image_name]
