Debugging the PyTorch version of RetinaNet rather than the Keras one. Apart from Facebook's implementation and mmdetection, there are two PyTorch RetinaNet repos on GitHub: yhenon's and kuangliu's.
yhenon's version requires PyTorch 0.4, kuangliu's requires 1.0. Since I run PyTorch 1.0, I picked the latter.
It ships with no documentation at all, but fortunately the code is simple.
My dataset has 42 classes, so I need to rebuild the dataset and change a few other things.
The steps are as follows:
1. Get the pretrained weights
- First download the ResNet-50 checkpoint:
wget https://download.pytorch.org/models/resnet50-19c8e357.pth
Put resnet50-19c8e357.pth into the module directory, then run get_state_dict.py under scripts to build RetinaNet's initial parameters, which combine the resnet50-19c8e357.pth weights with the newly added, randomly initialized layers.
Important
Change num_classes=42 on line 26 of get_state_dict.py to match the number of classes in your own dataset.
Running the script produces net.pth:
python get_state_dict.py
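For reference, the idea behind get_state_dict.py is to load the pretrained ResNet-50 checkpoint into a freshly built RetinaNet, keeping the random initialization for layers that only exist in the new model (such as the 42-class head). A minimal sketch of that merge, using plain dicts as stand-ins for torch state dicts (all names here are illustrative, not the actual script):

```python
# Sketch: merge pretrained backbone weights into a new model's state dict.
# Plain lists stand in for torch tensors; the real script does the same
# key-matching over torch state_dicts.

def merge_pretrained(pretrained, model_state):
    """Copy every pretrained entry whose key exists in the new model;
    keys that only exist in the new model (e.g. the classification head)
    keep their random initialization."""
    merged = dict(model_state)
    for key, value in pretrained.items():
        if key in merged:
            merged[key] = value
    return merged

resnet50 = {"conv1.weight": [1, 2], "fc.weight": [3]}          # backbone checkpoint
retinanet = {"conv1.weight": [0, 0], "cls_head.weight": [9]}   # new model (random init)

net = merge_pretrained(resnet50, retinanet)
```

The backbone's `fc` layer is dropped because RetinaNet has no key for it, while the new head keeps its random weights.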
2. Generate your own dataset
- The conversion follows the voc_annotation.py approach from here; my original data is in VOC format.
import xml.etree.ElementTree as ET
from os import getcwd

sets = [('VOC_TT_512_test', 'train'), ('VOC_TT_512_test', 'val'), ('VOC_TT_512_test', 'trainval')]
type42 = "i2,i4,i5,il100,il60,il80,ip,p10,p11,p12,p19,p23,p26,p27,p3,p5,p6,pg,ph4,ph4.5,ph5,pl100,pl120,pl20,pl30,pl40,pl5,pl50,pl60,pl70,pl80,pm20,pm30,pm55,pn,pne,pr40,w13,w32,w55,w57,w59"
classes = type42.split(',')

def convert_annotation(year, image_id, list_file):
    in_file = open('%s/Annotations/%s.xml' % (year, image_id))
    tree = ET.parse(in_file)
    root = tree.getroot()
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text),
             int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write("," + ",".join([str(a) for a in b]) + ',' + str(cls_id))

wd = getcwd()
for year, image_set in sets:
    image_ids = open('%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('csv%s_%s.txt' % (year, image_set), 'w')
    for image_id in image_ids:
        list_file.write('%s/%s/JPEGImages/%s.jpg' % (wd, year, image_id))
        convert_annotation(year, image_id, list_file)
        list_file.write('\n')
    list_file.close()
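Each line of the generated csv*.txt file is the image path followed by repeating xmin,ymin,xmax,ymax,class_id groups, one group per object. A quick sanity check on that format (the path and numbers below are made up):

```python
# Parse one line in the converter's output format: image path first,
# then groups of xmin,ymin,xmax,ymax,class_id for each object.
line = "/data/VOC_TT_512_test/JPEGImages/00001.jpg,48,240,195,371,7,8,12,352,498,14"
fields = line.split(',')
path, rest = fields[0], fields[1:]
# Regroup the flat number list into 5-tuples, one per object.
boxes = [tuple(int(v) for v in rest[i:i + 5]) for i in range(0, len(rest), 5)]
```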
3. Modify the training parameters
Update trainset and testset to point at the files generated above.
4. Check the loss
During training I noticed that loc_loss stayed at 0.000 while cls_loss stayed pinned around 1.000:
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.619 | avg_loss: 1.619
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.631 | avg_loss: 1.625
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.695 | avg_loss: 1.648
loc_loss: 0.000 | cls_loss: 2.000 | train_loss: 2.269 | avg_loss: 1.803
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.789 | avg_loss: 1.801
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.978 | avg_loss: 1.830
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.628 | avg_loss: 1.801
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.904 | avg_loss: 1.814
loc_loss: 0.000 | cls_loss: 1.000 | train_loss: 1.652 | avg_loss: 1.796
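For context on that cls_loss column: RetinaNet's classification loss is the focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), which down-weights easy examples. A minimal sketch for a single binary prediction, using the paper's defaults alpha=0.25 and gamma=2 (these are not values read from this repo):

```python
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one predicted probability p of the positive class.
    p_t is the probability assigned to the true class; (1 - p_t)**gamma
    shrinks the loss of well-classified examples toward zero."""
    p_t = p if target == 1 else 1 - p
    alpha_t = alpha if target == 1 else 1 - alpha
    return -alpha_t * (1 - p_t) ** gamma * math.log(p_t)
```

A confident correct prediction (p=0.9 for a positive) contributes far less loss than a confident wrong one (p=0.1), which is the whole point of the down-weighting.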
Update
Since the code above did not train well, I switched to new code: yhenon/pytorch-retinanet. It targets PyTorch 0.4.1, so it needs some changes to run on PyTorch 1.0.
The NMS build uses a PyTorch 0.4 API (torch.utils.ffi) that was removed in 1.0, so compiling NMS directly fails with:
ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead
To make it work with PyTorch 1.0:
- Download the NMS code from here
- Copy nms into pytorch-retinanet's lib/nms directory
- cd nms, then rm -rf build and rm *.so
- cd .., then python setup3.py build_ext --inplace
Training can now continue.
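What the compiled extension computes is standard greedy non-maximum suppression. As a reference for what the build step above produces, here is a minimal pure-Python sketch (the repo's version is compiled C/CUDA for speed; this only shows the algorithm):

```python
def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS over [x1, y1, x2, y2] boxes.
    Returns indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)          # keep the highest-scoring remaining box
        keep.append(i)
        remaining = []
        for j in order:
            # Intersection-over-union between box i and box j.
            xx1 = max(boxes[i][0], boxes[j][0])
            yy1 = max(boxes[i][1], boxes[j][1])
            xx2 = min(boxes[i][2], boxes[j][2])
            yy2 = min(boxes[i][3], boxes[j][3])
            inter = max(0, xx2 - xx1) * max(0, yy2 - yy1)
            area_i = (boxes[i][2] - boxes[i][0]) * (boxes[i][3] - boxes[i][1])
            area_j = (boxes[j][2] - boxes[j][0]) * (boxes[j][3] - boxes[j][1])
            iou = inter / (area_i + area_j - inter)
            if iou <= iou_threshold:   # suppress boxes overlapping box i too much
                remaining.append(j)
        order = remaining
    return keep
```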
How to prepare the training data
Convert the VOC data to CSV:
import xml.etree.ElementTree as ET
from os import getcwd

sets = [('VOC_512_test', 'val'), ('VOC_512_test', 'trainval')]
class_names = "cow,cat,bird"  # renamed from `type`, which shadows a builtin
classes = class_names.split(',')

# Write the class-name -> id mapping the CSV loader expects.
with open("class_name.csv", "w") as f:
    for cls_id, name in enumerate(classes):
        f.write(name + ',' + str(cls_id) + '\n')

def convert_annotation(year, image_id, list_file):
    in_file = open('%s/Annotations/%s.xml' % (year, image_id))
    tree = ET.parse(in_file)
    root = tree.getroot()
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        if cls not in classes or int(difficult) == 1:
            continue
        xmlbox = obj.find('bndbox')
        # One object per line: path,x1,y1,x2,y2,class_name
        list_file.write('%s/%s/JPEGImages/%s.jpg' % (wd, year, image_id))
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text),
             int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write("," + ",".join([str(a) for a in b]) + ',' + cls)
        list_file.write('\n')

wd = getcwd()
for year, image_set in sets:
    image_ids = open('%s/ImageSets/Main/%s.txt' % (year, image_set)).read().strip().split()
    list_file = open('%s_%s.csv' % (year, image_set), 'w')
    for image_id in image_ids:
        convert_annotation(year, image_id, list_file)
    list_file.close()
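Unlike the first converter, these .csv files hold one object per line in the path,x1,y1,x2,y2,class_name format, with class_name.csv mapping names to ids. A quick check of the row format (the path and values below are made up):

```python
import csv
import io

# Two annotation rows in the one-object-per-line CSV format produced above.
annotations = "img/00001.jpg,48,240,195,371,cow\nimg/00001.jpg,8,12,352,498,bird\n"
rows = list(csv.reader(io.StringIO(annotations)))
```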