Training Your Own Faster RCNN

Dependencies installation

pip install scipy pillow matplotlib pyyaml easydict opencv-python

This repo

https://github.com/smallcorgi/Faster-RCNN_TF

allows us to train our own Faster-RCNN. To train the network, follow the instructions in the README file of the repo above until you are able to train on the VOC dataset.

During that process, you may encounter the following problems.

  1. When you execute

     python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
    

    you might get the following error message

     tensorflow.python.framework.errors_impl.NotFoundError: /home/neno/workspace/OCR/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
    

    To solve this problem, replace $REPO/lib/make.sh with the following content and run

     python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
    

    again. (P.S. There is no need to execute make again after you have modified the make.sh)

     #!/usr/bin/env bash
     TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
     TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
     echo $TF_INC
    
     CUDA_PATH=/usr/local/cuda/
    
     cd roi_pooling_layer
    
     nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
         -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
    
     ## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
     #g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
     #   roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
    
     # for gcc5-built tf
     g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc -D_GLIBCXX_USE_CXX11_ABI=0 \
         roi_pooling_op.cu.o -I $TF_INC -L $TF_LIB -ltensorflow_framework -D GOOGLE_CUDA=1 \
         -fPIC $CXXFLAGS -lcudart -L $CUDA_PATH/lib64
    
     cd ..
    
    
     # add building psroi_pooling layer
     cd psroi_pooling_layer
     nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
         -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
    
     g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
         psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
    
     ## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
     #g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
     #   psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
    
     cd ..
    


  2. When you execute

     python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
    

    you might get the following error message

     tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7104 
     (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000).  If 
     using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure 
     the library loaded at runtime matches a compatible version specified during compile configuration.
    

    To solve this problem, uninstall the cuDNN 7.1.4 and install cuDNN 7.0.5 instead.
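
The two numbers in that message are packed version codes, which makes the mismatch easy to overlook. Here is a minimal sketch for decoding them, assuming the cuDNN 7 convention of major * 1000 + minor * 100 + patch; the helper name `decode_cudnn_version` is my own and not part of any library.

```python
def decode_cudnn_version(code):
    """Split a packed cuDNN 7.x version code into (major, minor, patch).

    Assumes the packing major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    # The two codes quoted in the error message above.
    for label, code in (("loaded at runtime", 7104), ("compiled against", 7004)):
        print(label, ".".join(str(part) for part in decode_cudnn_version(code)))
```

Read this way, the message says that a 7.1.x runtime was loaded while TensorFlow was compiled against 7.0.x, which is why installing a 7.0.x release resolves it.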


Now, we are going to build our own dataset. First of all, you've got to prepare the images you want to train on, and then we make them look "nice" so that our network can be trained efficiently.

The author of this blog provides three scripts to resize the images, rename them, and generate index files. Here is a merged and improved version that I wrote on the basis of those scripts. The script below first resizes the images and saves them to the output directory, converting them all to jpg, and then renames them in VOC style.

import os
import sys

import cv2

path = sys.argv[1]
output_dir = sys.argv[2]

ticket_width = 300  # target width in pixels; the height keeps the aspect ratio

# Create the output directory and its three subdirectories.
output_dir_image = os.path.join(output_dir, 'images')
output_dir_label = os.path.join(output_dir, 'labels')
output_dir_index = os.path.join(output_dir, 'indexs')
for d in (output_dir, output_dir_image, output_dir_label, output_dir_index):
    if not os.path.exists(d):
        os.mkdir(d)

print('resizing raw images...')

for pic in os.listdir(path):
    im = cv2.imread(os.path.join(path, pic))
    if im is None:  # skip files that OpenCV cannot read as images
        continue
    h, w = im.shape[:2]
    ratio = float(ticket_width) / w
    h_new = int(ratio * h)
    im = cv2.resize(im, (ticket_width, h_new))
    # Save as .jpg regardless of the length of the original extension.
    new_path = os.path.join(output_dir_image, os.path.splitext(pic)[0] + '.jpg')
    cv2.imwrite(new_path, im)

print('renaming...')

i = 10000
# sorted() makes the numbering reproducible between runs.
for item in sorted(os.listdir(output_dir_image)):
    if item.endswith('.jpg'):
        src = os.path.join(os.path.abspath(output_dir_image), item)
        # VOC-style name: the index zero-padded to six digits.
        dst = os.path.join(os.path.abspath(output_dir_image), str(i).zfill(6) + '.jpg')
        try:
            os.rename(src, dst)
            i = i + 1
        except OSError:
            continue

print('finished')

It takes two arguments to run.

python script.py $IMAGES_DIR $OUTPUT_DIR
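
To make the two transformations in the script easier to check by hand, here is a small sketch that isolates the arithmetic, without OpenCV. The function names `scaled_size` and `voc_name` are hypothetical helpers that I introduce only for illustration; the formulas are the ones the script uses.

```python
def scaled_size(width, height, target_width=300):
    """Return (new_width, new_height), keeping the aspect ratio."""
    ratio = float(target_width) / width
    return target_width, int(ratio * height)


def voc_name(index, digits=6, ext=".jpg"):
    """Zero-pad the running index to a VOC-style file name."""
    return str(index).zfill(digits) + ext


if __name__ == "__main__":
    print(scaled_size(1200, 800))  # a 1200x800 image becomes 300x200
    print(voc_name(10000))         # the first renamed file: 010000.jpg
```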

Well done. Now you have completed all the operations that you need to do with the raw images. Next comes a heavier task: labeling the images. This tool provides utilities, but labeling still takes a lot of time. WHEN LABELING, PLEASE USE LOWER-CASE LETTERS FOR ALL LABELS. UPPER-CASE LABELS WOULD LEAD TO PROGRAM ERRORS.
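
If some labels were already typed with upper-case letters, they can be fixed after the fact instead of relabeling. The sketch below assumes the labeling tool writes Pascal VOC style XML, with each label stored in an `<object><name>` element; `lowercase_labels` is a hypothetical helper name of mine.

```python
import xml.etree.ElementTree as ET


def lowercase_labels(xml_text):
    """Return the annotation XML with every <object><name> lower-cased.

    Assumes Pascal VOC style annotations (label in <object><name>).
    """
    root = ET.fromstring(xml_text)
    for obj in root.findall("object"):
        name = obj.find("name")
        if name is not None and name.text:
            name.text = name.text.strip().lower()
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = "<annotation><object><name>Ticket</name></object></annotation>"
    print(lowercase_labels(sample))
```

To apply it to a whole label directory, read each .xml file, pass its text through the function, and write the result back.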

The last step is to produce the txt index files. The script below generates them automatically.

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import random
import sys

trainval_percent = 0.8  # tunable: fraction of all samples used for train + val
train_percent = 0.7  # tunable: fraction of trainval used for train
xmlfilepath = sys.argv[1]
txtsavepath = sys.argv[2]
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

# Open with 'w' so that re-running the script does not append duplicate entries.
ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    # An index entry is the annotation file name without its extension.
    name = os.path.splitext(total_xml[i])[0] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

To use it, pass the directory where the labels are stored and the directory where the index files should be placed.

python script.py $LABEL_DIR $OUTPUT_DIR

You will then get the following four text files

$OUTPUT_DIR/trainval.txt
$OUTPUT_DIR/test.txt
$OUTPUT_DIR/train.txt
$OUTPUT_DIR/val.txt

These files will guide the neural network to locate the dataset.
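
With the default percentages, 80% of the samples go to trainval and 70% of those go to train, so roughly 56% of the data ends up in train, 24% in val and 20% in test. The sketch below reproduces only the counting logic of the index script; `split_sizes` is a hypothetical name that I introduce for illustration.

```python
def split_sizes(num, trainval_percent=0.8, train_percent=0.7):
    """Return how many samples land in each index file."""
    tv = int(num * trainval_percent)  # samples in trainval.txt
    tr = int(tv * train_percent)      # samples in train.txt
    return {"trainval": tv, "train": tr, "val": tv - tr, "test": num - tv}


if __name__ == "__main__":
    print(split_sizes(1000))
```

Because both counts are truncated with int(), small datasets can deviate a little from these percentages.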


If you have reached this line, you are already very close to training your own FRCNN. Now we are going to replace the VOC dataset with our prepared data.

Put VOC images and labels into trash bin

rm -rf $VOC_DATA_DIR/VOCdevkit/VOC2007/JPEGImages/*
rm -rf $VOC_DATA_DIR/VOCdevkit/VOC2007/Annotations/*

and bring our data under the spotlight.

cp $IMAGE_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/JPEGImages/
cp $LABEL_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/Annotations/
cp $INDEX_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/ImageSets/Main/
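
Before training, it can save time to check that every name in an index file really has both an image and an annotation. This is a minimal sketch under the assumption that images are .jpg files in JPEGImages and labels are .xml files in Annotations, as laid out above; `missing_files` is a hypothetical helper, not part of the repo.

```python
import os


def missing_files(voc_root, split="trainval"):
    """Return paths referenced by ImageSets/Main/<split>.txt but absent on disk."""
    index_path = os.path.join(voc_root, "ImageSets", "Main", split + ".txt")
    with open(index_path) as f:
        names = [line.strip() for line in f if line.strip()]
    missing = []
    for name in names:
        for sub, ext in (("JPEGImages", ".jpg"), ("Annotations", ".xml")):
            candidate = os.path.join(voc_root, sub, name + ext)
            if not os.path.isfile(candidate):
                missing.append(candidate)
    return missing


if __name__ == "__main__":
    import sys
    # Usage: python check_dataset.py $VOC_DATA_DIR/VOCdevkit/VOC2007
    for path in missing_files(sys.argv[1]):
        print("missing:", path)
```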

Now the very last step is to modify the source code. Four changes have to be made.

  1. $FRCNN_DIR/lib/datasets/pascal_voc.py

    Find variable _classes

     self._classes = ('__background__', # always index 0
                      'aeroplane', 'bicycle', 'bird', 'boat',
                      'bottle', 'bus', 'car', 'cat', 'chair',
                      'cow', 'diningtable', 'dog', 'horse',
                      'motorbike', 'person', 'pottedplant',
                      'sheep', 'sofa', 'train', 'tvmonitor')
    

    and append your own classes at the tail.

     self._classes = ('__background__', # always index 0
                      'aeroplane', 'bicycle', 'bird', 'boat',
                      'bottle', 'bus', 'car', 'cat', 'chair',
                      'cow', 'diningtable', 'dog', 'horse',
                      'motorbike', 'person', 'pottedplant',
                      'sheep', 'sofa', 'train', 'tvmonitor', 'class1', 'class2')
    
  2. $FRCNN_DIR/lib/networks/VGGnet_train.py

    This line indicates the total number of classes

     n_classes = 21
    

    If your own data has n classes to recognize, increase its value by n.

     # Two classes to recognize for example
     n_classes = 23
    
  3. $FRCNN_DIR/lib/networks/VGGnet_test.py

    Same as 2

  4. $FRCNN_DIR/tools/demo.py

    Find variable CLASSES and append your own classes at the tail, the same way as you modified _classes in pascal_voc.py
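
The four changes above have to stay consistent with each other: n_classes must equal the length of the class tuple, background included. The sketch below is only an illustration of that relationship, not code from the repo; `MY_CLASSES` and `class_to_ind` are names I introduce here, and 'class1' and 'class2' are the placeholder labels from the example above.

```python
VOC_CLASSES = ('__background__',  # always index 0
               'aeroplane', 'bicycle', 'bird', 'boat',
               'bottle', 'bus', 'car', 'cat', 'chair',
               'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant',
               'sheep', 'sofa', 'train', 'tvmonitor')

# Placeholder labels; replace them with your own lower-case labels.
MY_CLASSES = ('class1', 'class2')

CLASSES = VOC_CLASSES + MY_CLASSES

# The value to write into VGGnet_train.py and VGGnet_test.py.
n_classes = len(CLASSES)

# Each label maps to its position in the tuple.
class_to_ind = dict(zip(CLASSES, range(n_classes)))

if __name__ == "__main__":
    print(n_classes)
    print(class_to_ind['class1'], class_to_ind['class2'])
```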



Now, we are ready to train. Use this command to start training

./experiments/scripts/faster_rcnn_end2end.sh gpu 0 VGG16 pascal_voc

Here are the two errors I encountered:

  1.  Traceback (most recent call last):
       File "./tools/train_net.py", line 83, in <module>
         roidb = get_training_roidb(imdb)
       File "/home/yinqsh/Ningyuan/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 204, in get_training_roidb
         imdb.append_flipped_images()
       File "/home/yinqsh/Ningyuan/Faster-RCNN_TF/tools/../lib/datasets/imdb.py", line 113, in append_flipped_images
         assert (boxes[:, 2] >= boxes[:, 0]).all()
    

I googled it, and it turns out that this error is probably caused by some illegal bounding boxes. The boundaries of these bounding boxes exceed the image boundaries and therefore lead to crashes. One solution is to delete all cache files, so that data cached from a previous dataset does not get mixed in.

rm -rf $FRCNN/output
rm -rf $FRCNN/data/cache
rm -rf $FRCNN/VOCdevkit2007/annotations_cache  # if this directory exists

Another solution is to modify append_flipped_images() method in $FRCNN/lib/datasets/imdb.py. Find this line of code

boxes[:, 2] = widths[i] - oldx1 - 1

and add the following lines of code right below

boxes[:, 2] = widths[i] - oldx1 - 1
# ------------------TO-ADD-PART------------------
for b in range(len(boxes)):
    if boxes[b][2] < boxes[b][0]:
        boxes[b][0] = 0
# ------------------TO-ADD-PART------------------

  2. KeyError: 'max_overlaps'
    

Solution: delete the caches

rm -rf $FRCNN/output
rm -rf $FRCNN/data/cache
rm -rf $FRCNN/VOCdevkit2007/annotations_cache  # if this directory exists
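
For the first error (the failed assertion in append_flipped_images), it may be easier to find the offending bounding boxes in the annotations than to patch the training code. The sketch below assumes Pascal VOC style XML with a `<size>` element and 1-based pixel coordinates in `<bndbox>`; `out_of_bounds_boxes` is a hypothetical helper of mine, not part of the repo.

```python
import xml.etree.ElementTree as ET


def out_of_bounds_boxes(xml_text):
    """Return (label, xmin, ymin, xmax, ymax) for boxes that leave the image.

    Assumes Pascal VOC XML with 1-based pixel coordinates.
    """
    root = ET.fromstring(xml_text)
    width = int(root.find("size/width").text)
    height = int(root.find("size/height").text)
    bad = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(float(box.find(tag).text))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        outside = xmin < 1 or ymin < 1 or xmax > width or ymax > height
        inverted = xmin > xmax or ymin > ymax
        if outside or inverted:
            bad.append((obj.find("name").text, xmin, ymin, xmax, ymax))
    return bad
```

Running it over every file in the Annotations directory and fixing or removing the reported boxes should make the assertion pass once the caches have been deleted.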

After training, you will find the trained model in

$FRCNN_DIR/output/faster_rcnn_end2end/voc_2007_trainval/

By default, the model is saved every 5000 iterations. We are going to use the model that was trained for the most iterations. There are three files for that model

model.ckpt.meta
model.ckpt.data-00000-xx-00000
model.ckpt.index

Make a copy of the model.ckpt.data file in the same folder and remove the suffix .data-00000-xx-00000 from the name of that copy.
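
The copy step can also be scripted. This is a minimal sketch using only the standard library; `copy_data_file_as_ckpt` is a hypothetical helper name, and the checkpoint prefix is whatever path your training run produced, without the .data suffix.

```python
import glob
import os
import shutil


def copy_data_file_as_ckpt(ckpt_prefix):
    """Copy '<prefix>.data-*' to '<prefix>' and return the source path used."""
    matches = sorted(glob.glob(glob.escape(ckpt_prefix) + ".data-*"))
    if not matches:
        raise IOError("no .data file found for " + ckpt_prefix)
    shutil.copyfile(matches[0], ckpt_prefix)
    return matches[0]


if __name__ == "__main__":
    import sys
    # Usage: python copy_ckpt.py <path to the checkpoint prefix>
    print("copied", copy_data_file_as_ckpt(sys.argv[1]))
```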

Finally, we are ready to test the power of FRCNN. Put some test images into

$FRCNN/data/demo

and run

python $FRCNN/tools/demo.py --model $FRCNN/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_10000.ckpt


BANG! BANG! BANG!



Bugs encountered when moving the code from Python 2 to Python 3

  1. When you execute

     python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
    

    you might get the following error message

    ImportError: /media/neno/44B0AB27B0AB1F04/Faster RCNN/Faster-RCNN_TF/tools/../lib/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct
    

    To solve this problem, go to

     $FRCNN_DIR/lib
    

    and run

     python3 setup.py build_ext --inplace
    
  2. When you execute

     python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
    

    you might get the following error message

    ImportError: No module named 'cPickle'
    

    To solve this problem, change cPickle to pickle.

  3. After upgrading pip3, pip3 crashes.

     Traceback (most recent call last):
       File "/usr/bin/pip3", line 9, in <module>
         from pip import main
     ImportError: cannot import name 'main'
    

To solve this problem, open /usr/bin/pip3 and change the following code

    from pip import main
    if __name__ == '__main__':
        sys.exit(main())

to

    from pip import __main__
    if __name__ == '__main__':
        sys.exit(__main__._main())

  4. When you try to run $FRCNN/lib/make.sh, you might get the following error

     fatal error: nsync_cv.h: No such file or directory
     #include "nsync_cv.h"
    

To solve this problem, open the file that causes this error and change its two nsync include lines (#include "nsync_cv.h" and #include "nsync_mu.h") to

  #include "external/nsync/public/nsync_cv.h"
  #include "external/nsync/public/nsync_mu.h"

  5. When you try to run demo.py, you might get the following error

     cudaCheckError() : no kernel image is available for execution on the device
    

To solve this problem, open $FRCNN/lib/make.sh and check the extra options passed to the nvcc compiler. There should be an option called arch, which specifies the compute architecture of the Nvidia card. Check the compute capability of your card and change the arch option accordingly. In my case, I am using a Tesla K80, which has compute capability 3.7. I used to compile with -arch=sm_52, which caused this error. After I changed it to -arch=sm_35, things went well.
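
The flag follows directly from the compute capability: drop the dot and prefix it with sm_. The helper below is only a sketch of that naming rule; `arch_flag` is a name I made up, and whether a given target is accepted still depends on your CUDA toolkit version.

```python
def arch_flag(compute_capability):
    """Turn a compute capability such as '3.7' into an nvcc -arch flag."""
    major, minor = compute_capability.split(".")
    return "-arch=sm_{}{}".format(int(major), int(minor))


if __name__ == "__main__":
    print(arch_flag("3.7"))  # Tesla K80
    print(arch_flag("5.2"))  # the value originally in make.sh
```

For the K80 this gives -arch=sm_37. The -arch=sm_35 that I used also works, because code built for compute capability 3.5 can run on a device with the same major version and a higher minor version, while code built for 5.2 cannot run on a 3.x card.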


P.S.

By default, the network loads its parameters from a pre-trained VGG16 model. However, in my experience it might not perform well on the test set. Instead, training from scratch gave relatively better predictions.


References:

https://blog.csdn.net/zcy0xy/article/details/79614862

