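I ran into many problems, large and small, while installing TensorFlow. The notes below summarize the installation procedure that worked for me on CentOS.
Install common CentOS packages
yum -y install zip unzip vim git lrzsz wget gcc gcc-c++ python-devel java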
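Install JDK 8
- Run java -version. If Java 1.6 or 1.7 is already installed, uninstall it with the following command.
- yum remove java
- Download the JDK 8 RPM that matches your platform
- Install it with rpm -ivh name.rpm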
Install pip for Python
- wget https://bootstrap.pypa.io/get-pip.py --no-check-certificate
- python get-pip.py
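Install bazel
Download the bazel source
Method 1 (recommended):
(1) Download the release archive for the version you need (0.4.5 here) and unpack it
(2) cd bazel (that is, the directory the archive was unpacked into)
(3) ./compile.sh
(4) Add the output directory /root/bazel-0.4.5-dist/output/ to the $PATH environment variable by adding these two lines to /etc/profile (the path is printed when the build finishes):
- export BAZELHOME=/root/bazel-0.4.5-dist/output
- export PATH=$PATH:$BAZELHOME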
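Method 2:
(1) git clone https://github.com/bazelbuild/bazel.git
(2) cd bazel
(3) git checkout tags/0.1.0 (substitute the tag of the version you need, for example 0.4.5 to match method 1)
(4) ./compile.sh
(5) Add the output directory to the $PATH environment variable (adjust the path below to the output directory of your own source tree):
- export BAZELHOME=/root/bazel-0.4.5-dist/output
- export PATH=$PATH:$BAZELHOME
After adding the two export lines to /etc/profile in vim, press Esc, type :wq to save and quit, then run source /etc/profile to apply the change.
Run bazel -h to confirm that bazel is available.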
Problems encountered during installation
- Cannot find gcc, either correct your path or set the CC environment variable. (Fix: yum install gcc)
- ERROR: /root/Configuration/bazel-0.4.5-dist/src/main/tools/BUILD:3:1: undeclared inclusion(s) in rule '//src/main/tools:process-wrapper' (My notes record pip install process-wrapper as the fix. Note, however, that process-wrapper is a C++ tool built from bazel's own source tree, not a Python package, so this error more likely points to a problem with the gcc/g++ toolchain or its headers.)
- which: no git in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin) (Fix: yum install git)
- gcc: error trying to exec 'cc1plus': execvp: No such file or directory (Fix: yum install gcc-c++. The installation pulls in an important library, libstdc++-devel-4.8.5-11.el7.x86_64.rpm.)
- /usr/local/aiconfiguration/bazel-0.4.5/src/BUILD:129:1: Executing genrule //src:embedded_tools failed: bash failed: error executing command (cd /tmp/bazel_xujO1WJE/out/execroot/bazel-0.4.5 &&
exec env - \PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin
…
src/create_embedded_tools.sh: line 85: zip: command not found (Pay special attention here: the last line is the key. It says that zip was not found. Fix: yum install zip)
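Most of the failures above come down to a build tool that is not on PATH. As a rough sketch, the check below looks for those tools before running ./compile.sh. It assumes Python 3.3 or newer (for shutil.which); the tool list and the function name missing_tools are my own choices for illustration, not part of bazel.

```python
import shutil  # shutil.which is available from Python 3.3

# Tools whose absence caused the build errors listed above (illustrative list).
REQUIRED_TOOLS = ["gcc", "g++", "git", "zip", "unzip"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All build prerequisites found.")
```

Anything reported as missing can then be installed with yum before starting the build.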
Install numpy, scipy and Pillow (in python, run help('name') for each module to check whether it is present; install it if it is not)
pip install numpy scipy pillow
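Instead of calling help('name') module by module, the same check can be scripted. The following is a minimal sketch; the helper name missing_modules is hypothetical. Note that Pillow is imported under the module name PIL.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Pillow installs a package that is imported as "PIL".
    print(missing_modules(["numpy", "scipy", "PIL"]))
```

An empty list means all three packages are importable.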
GPU support (toolkit download and installation)
- Install the CUDA Toolkit
- Download the package that matches your system
- rpm -i name.rpm
- sudo yum clean all
- sudo yum install cuda
- Problems encountered during installation
- dkms is required: (1) download the dkms package (2) run rpm -Uvh dkms-2.2.0.3-31.1.noarch.rpm
- Install cuDNN
- Download the cuDNN package
- tar xvzf cudnn-6.5-linux-x64-v2.tgz
- cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include
- cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
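Install the NVIDIA driver
- Download the driver that matches your OS version, platform and GPU model, for example the .run package
- Install it with sh name.run
- Problems encountered during installation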
- ERROR: Unable to find the kernel source tree for the currently running kernel. Please make sure you have installed the kernel source files for your kernel and that they are properly configured; on Red Hat Linux systems, for example, be sure you have the 'kernel-source' or 'kernel-devel' RPM installed. If you know the correct kernel source files are installed, you may specify the kernel source path with the '--kernel-source-path' command line option
- present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA graphics device(s), or no NVIDIA GPU installed in this system is supported by this NVIDIA Linux graphics driver release.Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
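- Solutions
- This problem is caused by the version of kernel-devel installed through yum: a plain yum install kernel-devel may install a version that does not match the running kernel. If you hit the errors above, install the matching version with yum -y install kernel-devel "kernel-devel-uname-r == $(uname -r)".
- If rerunning sh name.run after the first step still reports "you may specify the kernel source path with the '--kernel-source-path'", pass the path explicitly: sh name.run --kernel-source-path=/usr/src/kernels/<version>, where <version> is the version of the newly installed kernel-devel (it should match the output of uname -r).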
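Before rerunning the installer, it helps to confirm that a kernel-devel tree really exists for the running kernel. The sketch below does this; it assumes kernel-devel installs its sources under /usr/src/kernels/<release>, as in the command above, and the function name kernel_source_dir is my own.

```python
import os
import platform


def kernel_source_dir(release=None, base="/usr/src/kernels"):
    """Return base/<release> if that directory exists, otherwise None."""
    if release is None:
        release = platform.release()  # same string that `uname -r` prints
    candidate = os.path.join(base, release)
    return candidate if os.path.isdir(candidate) else None


if __name__ == "__main__":
    path = kernel_source_dir()
    if path:
        print("Kernel sources found: " + path)
    else:
        print("No kernel-devel tree matches kernel " + platform.release())
```

If a path is printed, it is the value to pass to --kernel-source-path.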
Install TensorFlow [mind the underscores in the commands]
Install TensorFlow from source
Method 1 (recommended):
(1) Download the source archive for the version you need, 1.0.1 here (Source code (tar.gz)), and unpack it
(2) cd tensorflow
(3) ./configure
(4) bazel build --config=cuda //tensorflow/tools/pip_package:build_pip_package (use --config=opt for CPU-only support or --config=cuda for GPU support)
(5) bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg (this command prints a path when it finishes; that directory contains the generated .whl file)
(6) pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-*.whl
Method 2:
(1) git clone https://github.com/tensorflow/tensorflow
(2) cd tensorflow
(3) git checkout r1.0
(4) ./configure
(5) bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package (use --config=opt for CPU-only support or --config=cuda for GPU support)
(6) bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg (this command prints a path when it finishes; that directory contains the generated .whl file)
(7) pip install /tmp/tensorflow_pkg/tensorflow-1.0.1-*.whl
Start python and run help('tensorflow') to view the module information.
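The exact name of the generated wheel depends on the Python version and platform, which is why the pip command above uses a wildcard. As a small sketch, the helper below lists the wheels that build_pip_package produced. The function name find_wheels is hypothetical, and /tmp/tensorflow_pkg is simply the output directory used in the steps above.

```python
import glob
import os


def find_wheels(pkg_dir="/tmp/tensorflow_pkg"):
    """Return the sorted tensorflow-*.whl files found in pkg_dir."""
    return sorted(glob.glob(os.path.join(pkg_dir, "tensorflow-*.whl")))


if __name__ == "__main__":
    for wheel in find_wheels():
        print(wheel)
```

An empty result usually means the build_pip_package step did not complete.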
Things to watch for during installation (GPU support)
- Be careful with the [y/n] questions asked by the configure script, because the default answer is n.
- Pay attention to the bazel options that TensorFlow's ./configure sets up
- Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]: (this prompt takes compiler optimization flags, and pressing Enter accepts the default; --config=cuda is given later on the bazel build command line)
- Do you wish to build TensorFlow with GPU support? [y/n] Answer y here; the default is n.
- Do you wish to build TensorFlow with OpenCL support? [y/N] If you answer y, you are asked for the default paths of clang and clang++, and you also run into: Please specify the location where ComputeCpp for SYCL 1.2 is installed. [Default is /usr/local/computecpp]: Invalid SYCL 1.2 library path. /usr/local/computecpp/lib/libComputeCpp.so cannot be found. OpenCL has no direct connection to CUDA, so this option can be set to n.
- Please specify the location where CUDA 7.0 toolkit is installed. Refer to README.md for more details. [default is: /usr/local/cuda]: Set this to the path where CUDA was installed.
- Please specify the location where CUDNN 6.5 V2 library is installed. Refer to README.md for more details. [default is: /usr/local/cuda]: Set this to the path where the cuDNN files were copied (the CUDA install path in the steps above).
- Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size [Default is: "3.5,5.2"]: 6.1 (this value depends on your own device; see the CUDA-Enabled GeForce Products list on the page linked above)
Test TensorFlow
Start python
Enter import tensorflow as tf
Enter hello = tf.constant('Hello, TensorFlow!')
Enter sess = tf.Session()
Enter print sess.run(hello)
Output: Hello, TensorFlow!
Enter a = tf.constant(10)
Enter b = tf.constant(32)
Enter print sess.run(a+b)
Output: 42
Problems you may encounter
Message: failed call to cuInit: CUDA_ERROR_UNKNOWN driver does not appear to be running on this host (localhost.localdomain): /proc/driver/nvidia/version does not exist. This happens because GPU support was selected when building from source, so the NVIDIA driver has to be installed. See the NVIDIA driver installation section of this tutorial for the procedure.
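The error message names the file it looks for, so the same condition can be checked directly before starting TensorFlow. A minimal sketch, assuming only that the loaded driver exposes /proc/driver/nvidia/version as the message states; the function name nvidia_driver_info is hypothetical.

```python
import os


def nvidia_driver_info(path="/proc/driver/nvidia/version"):
    """Return the first line of the driver version file, or None if it is absent."""
    if not os.path.exists(path):
        return None
    with open(path) as version_file:
        return version_file.readline().strip()


if __name__ == "__main__":
    info = nvidia_driver_info()
    print(info if info else "NVIDIA driver does not appear to be loaded")
```

A None result corresponds to the "does not exist" case in the message above.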
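Install ffmpeg
Method 1 (build ffmpeg from source, recommended)
- Download the version you need
- ./configure --prefix=/usr [sets the install path]
- make
- make install
- Configure the environment variables (FFMPEG should point to the bin directory under the prefix you passed to ./configure)
(1) export FFMPEG=/usr/local/aiconfiguration/ffmpeg/bin
(2) export PATH=$PATH:$FFMPEG
- Check the version with ffmpeg -version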
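Method 2
yum install ffmpeg
Show detailed ffmpeg package information: yum info ffmpeg (the packaged version is rather old, 2.6)
Problems encountered with method 2:
yum install ffmpeg reports that no package matches ffmpeg
(1) Install EPEL Release
- The installation needs additional repo sources, so EPEL is required: yum install -y epel-release
- If a missing GPG key is reported, import it with rpm --import /etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
- Check that it was installed: yum repolist
(2) Install the Nux-Dextop repo
- Import the GPG key
sudo rpm --import http://li.nux.ro/download/nux/RPM-GPG-KEY-nux.ro
- Install the nux-dextop repo
sudo rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-1.el7.nux.noarch.rpm
- Check that the repo was installed
yum repolist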
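Whichever method is used, it is worth confirming which ffmpeg version ends up on PATH, since the packaged one can be much older than a source build. The sketch below parses the first line of ffmpeg -version output, which starts with "ffmpeg version". It assumes Python 3.3 or newer, and both function names are my own.

```python
import re
import shutil
import subprocess


def parse_ffmpeg_version(output):
    """Extract the version token from the first line of `ffmpeg -version` output."""
    match = re.match(r"ffmpeg version (\S+)", output)
    return match.group(1) if match else None


def installed_ffmpeg_version():
    """Return the version of the ffmpeg found on PATH, or None if there is none."""
    if shutil.which("ffmpeg") is None:
        return None
    output = subprocess.check_output(["ffmpeg", "-version"])
    return parse_ffmpeg_version(output.decode("utf-8", "replace"))
```

Calling installed_ffmpeg_version() returns None when ffmpeg is not on PATH, which is the situation the environment variable step in method 1 is meant to fix.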
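Uninstall TensorFlow
- How to uninstall a source installation
- pip uninstall tensorflow
- Remove the TensorFlow source directory
Uninstall numpy, scipy and Pillow
pip uninstall numpy scipy pillow
Uninstall bazel
- How to uninstall a source installation of bazel
- Remove the bazel source directory
- Remove the environment variables set in /etc/profile, save the file, and apply the change with source /etc/profile
Uninstall ffmpeg
- Remove the ffmpeg files installed from source (their location is determined by the prefix set with ./configure --prefix=/usr [sets the install path]); remove only ffmpeg's own files, not the prefix directory itself
- Remove the directory that the ffmpeg archive was unpacked into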