Running TensorFlow
After CUDA is installed, you still need to set a few environment variables. Open a terminal and enter the following command:
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
On a 64-bit system, enter:
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
On a 32-bit system, enter:
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
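The `${VAR:+:${VAR}}` form used in the exports above appends ":$VAR" only when the variable is already set and non-empty, so no stray colon is left behind when it is empty. A minimal sketch of that behaviour follows; the helper names prepend_path and cuda_lib_dir are hypothetical and exist only for illustration.

```shell
# Hypothetical helper: join a new directory with an existing search path.
# $1 = directory to put first, $2 = current value (may be empty)
prepend_path() {
  printf '%s%s\n' "$1" "${2:+:$2}"
}

# Hypothetical helper: pick the CUDA library directory by word size.
# $1 = CUDA root, $2 = word size in bits (on Linux, `getconf LONG_BIT` reports it)
cuda_lib_dir() {
  if [ "$2" = "64" ]; then
    printf '%s/lib64\n' "$1"
  else
    printf '%s/lib\n' "$1"
  fi
}

prepend_path /usr/local/cuda-9.0/bin ""        # prints /usr/local/cuda-9.0/bin
prepend_path /usr/local/cuda-9.0/bin /usr/bin  # prints /usr/local/cuda-9.0/bin:/usr/bin
cuda_lib_dir /usr/local/cuda-9.0 64            # prints /usr/local/cuda-9.0/lib64
cuda_lib_dir /usr/local/cuda-9.0 32            # prints /usr/local/cuda-9.0/lib
```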
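If you need the libraries bundled with the models repository (checked out here as models-master), add it to PYTHONPATH: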
export PYTHONPATH=$PYTHONPATH:/home/time/ImageNet/models-master
Running ResNet
The ResNet code is in the official/resnet directory.
Assume the ImageNet data is stored in:
/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF
Then run:
python imagenet_main.py --data_dir='/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF' --batch_size=16 --model_dir='./model_101Res/' --resnet_size=101
The commands above can be collected into a shell script:
export PYTHONPATH=$PYTHONPATH:/home/time/ImageNet/models-master
export PATH=/usr/local/cuda-9.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
python imagenet_main.py --data_dir='/media/time/20162AC5162A9BB2/Thunder/ImageNet_TF' --batch_size=256 --model_dir='./modelChkPt2/' --resnet_size=18
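Since the script depends on several hard-coded directories, it may help to fail early when one of them is missing. The following is only a sketch under that assumption; require_dir is a hypothetical helper, not part of the original script.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): succeed only if
# the given directory exists, otherwise print a message to stderr.
require_dir() {
  if [ ! -d "$1" ]; then
    echo "missing directory: $1" >&2
    return 1
  fi
  return 0
}

# Possible use at the top of the script, before the python command:
#   require_dir /usr/local/cuda-9.0/lib64 || exit 1
#   require_dir /media/time/20162AC5162A9BB2/Thunder/ImageNet_TF || exit 1
```

With these checks in place, a wrong mount point or CUDA path is reported immediately instead of surfacing later as an error inside the training run.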
Note the parameters of imagenet_main.py:
flags:
imagenet_main.py:
-bs,--batch_size:
Batch size for training and evaluation. When using multiple gpus, this is the
global batch size for all devices. For example, if the batch size is 32 and
there are 4 GPUs, each GPU will get 8 examples on each step.
(default: '32')
(an integer)
--[no]clean:
If set, model_dir will be removed if it exists.
(default: 'false')
-dd,--data_dir:
The location of the input data.
(default: '/tmp')
-df,--data_format: <channels_first|channels_last>:
A flag to override the data format used in the model. channels_first provides a
performance boost on GPU but is not always compatible with CPU. If left
unspecified, the data format will be chosen automatically based on whether
TensorFlow was built for CPU or GPU.
-ebe,--epochs_between_evals:
The number of training epochs to run between evaluations.
(default: '1')
(an integer)
-ed,--export_dir:
If set, a SavedModel serialization of the model will be exported to this
directory at the end of training. See the README for more details and relevant
links.
-hk,--hooks:
A list of (case insensitive) strings to specify the names of training hooks.
Hook:
profilerhook
loggingtensorhook
examplespersecondhook
loggingmetrichook
Example: `--hooks ProfilerHook,ExamplesPerSecondHook`
See official.utils.logs.hooks_helper for details.
(default: 'LoggingTensorHook')
(a comma separated list)
-md,--model_dir:
The location of the model checkpoint files.
(default: '/tmp')
-rs,--resnet_size: <18|34|50|101|152|200>:
The size of the ResNet model to use.
(default: '50')
-rv,--resnet_version: <1|2>:
Version of ResNet. (1 or 2) See README.md for details.
(default: '2')
-te,--train_epochs:
The number of epochs used to train.
(default: '100')
(an integer)
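As the --batch_size description above states, the value is a global batch size that is split across all GPUs. The arithmetic can be sketched as follows; per_gpu_batch is a hypothetical helper, and the numbers are the ones from the help text.

```shell
# Hypothetical helper: examples each GPU receives per step.
# $1 = global batch size, $2 = number of GPUs
per_gpu_batch() {
  echo $(( $1 / $2 ))
}

per_gpu_batch 32 4   # prints 8
```

When choosing --batch_size for a multi-GPU machine, it is therefore the per-GPU share, not the global value, that has to fit in each device's memory.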
Using TensorBoard
tensorboard --logdir=/home/time/ImageNet/models-master/official/resnet/model_101Res
This starts TensorBoard, which lets you monitor the state of the training run.