Having already used Swin for classification and object detection, I wanted to try segmentation too. In many scenarios segmentation is more useful, for example standing-water detection, seatbelt detection, and pit or crack detection.
The case comes from a competition:
https://www.dcic-china.com/competitions/10021
Fittingly, the smart-agriculture track is a cattle image segmentation competition, so this really is a trial run on cattle.
Data and task
Cattle instance segmentation images are provided as training samples. Competitors build a model on these samples and perform instance segmentation on the cattle images in the test set; the method is not limited to instance segmentation.
Object detection identifies what is in an image and where it is located.
Semantic segmentation assigns a class label to every pixel. Instance segmentation is essentially the combination of the two: first detect the objects in the image (object detection), then label each of their pixels (semantic segmentation). Semantic segmentation does not distinguish different instances of the same class (all people are painted red), whereas instance segmentation does (different people get different colors). Strictly speaking this task is semantic segmentation, but each individual cow must still be marked.
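The distinction can be sketched with a toy numpy example (a hypothetical 6×6 "image" containing two cows; the arrays here are illustrative, not competition data):

```python
import numpy as np

# Semantic segmentation: one label map, every cow pixel gets class 1.
semantic = np.zeros((6, 6), dtype=int)
semantic[1:5, 0:2] = 1   # cow A's pixels
semantic[1:5, 4:6] = 1   # cow B's pixels

# Instance segmentation: one binary mask per cow, so individuals stay separable.
cow_a = np.zeros((6, 6), dtype=bool)
cow_a[1:5, 0:2] = True
cow_b = np.zeros((6, 6), dtype=bool)
cow_b[1:5, 4:6] = True

# The semantic map only says "16 cow pixels"; the instance masks say
# "two cows of 8 pixels each".
print((semantic == 1).sum())     # 16
print(cow_a.sum(), cow_b.sum())  # 8 8
```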
A look at the data: the images are from cow sheds, the labels are polygons, the annotation quality is mediocre, the images are somewhat blurry, and the viewpoint is top-down, so the COCO "cow" category is probably too different to transfer well. The dataset is small: 200 training images containing over 2,000 cows, and 100 test images.
You can view and fix the annotations with a labeling tool such as CVAT, though that one runs in the browser.
Here I start with a baseline using swin-transformer-object-detection:
https://github.com/SwinTransformer/Swin-Transformer-Object-Detection
Data preparation
Environment setup
The Swin environment is set up the same way as for object detection. I use PaddleX to process the dataset, so set up its environment:
conda create -n paddlex python=3.7
conda activate paddlex
pip install cython
git clone https://github.com/philferriere/cocoapi.git
cd .\cocoapi\PythonAPI
python3 setup.py build_ext install
pip install paddlepaddle -i https://mirror.baidu.com/pypi/simple
pip install paddlex -i https://mirror.baidu.com/pypi/simple
git clone https://github.com/PaddlePaddle/PaddleX.git
cd PaddleX
git checkout develop
python setup.py install
With the environment ready, convert the dataset to COCO format and split it (the test split can be omitted):
paddlex --split_dataset --format COCO --dataset_dir 200 --val_value 0.2 --test_value 0.1
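Assuming paddlex assigns whatever is left after the validation and test splits to the training set, the ratios above partition the 200 images roughly like this (a rough stand-in for the tool's behaviour, not its actual implementation):

```python
import random

# Stand-ins for the ~200 labelled images
images = [f"img_{i:03d}.jpg" for i in range(200)]
random.seed(0)
random.shuffle(images)

val_value, test_value = 0.2, 0.1
n_val = int(len(images) * val_value)    # 40
n_test = int(len(images) * test_value)  # 20

val = images[:n_val]
test = images[n_val:n_val + n_test]
train = images[n_val + n_test:]         # remainder -> 140
print(len(train), len(val), len(test))  # 140 40 20
```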
Then copy the images into the training and validation folders:
# -*- coding: utf-8 -*-
import json
import os
import shutil

path = 'E:/workspace/Swin-Transformer-Object-Detection/data/cow/200/'
valDir = 'E:/workspace/Swin-Transformer-Object-Detection/data/cow/200/val2017/'
trainDir = 'E:/workspace/Swin-Transformer-Object-Detection/data/cow/200/train2017/'

def copy_split(ann_file, dst_dir):
    """Copy every image listed in a COCO annotation file into dst_dir."""
    with open(ann_file, 'r') as fp:
        data = json.load(fp)
    images = [fi['file_name'] for fi in data['images']]
    print(images)
    os.makedirs(dst_dir, exist_ok=True)
    for v in images:
        _, filename = os.path.split(v)
        shutil.copy(path + v, dst_dir + filename)

copy_split('./200/train.json', trainDir)
copy_split('./200/val.json', valDir)
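After copying, a quick sanity check can catch missing files: every image listed in a split's JSON should exist in the corresponding folder. `check_split` and the throwaway demo below are illustrative helpers, not part of the competition code:

```python
import json
import os
import tempfile

def check_split(json_path, img_dir):
    """Return (number of images listed, names missing from img_dir)."""
    with open(json_path) as fp:
        names = [os.path.basename(im['file_name']) for im in json.load(fp)['images']]
    missing = [n for n in names if not os.path.exists(os.path.join(img_dir, n))]
    return len(names), missing

# Demo on a throwaway split (stand-in for ./200/train.json and train2017/):
with tempfile.TemporaryDirectory() as d:
    ann = {'images': [{'file_name': 'a.jpg'}, {'file_name': 'b.jpg'}]}
    json_path = os.path.join(d, 'train.json')
    with open(json_path, 'w') as fp:
        json.dump(ann, fp)
    open(os.path.join(d, 'a.jpg'), 'w').close()  # "copy" only one of the two
    total, missing = check_split(json_path, d)
    print(total, missing)  # 2 ['b.jpg']
```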
Model configuration
With the images ready, configure Swin-T.
Modify the configs
In configs/_base_/models/mask_rcnn_swin_fpn.py, change num_classes in both places to the actual number of classes, here 1.
In configs/_base_/default_runtime.py, change interval and load_from:
root@k8s-master1:/media/nizhengqi/7a646073-10bf-41e4-93b5-4b89df793ff8/wyh/Swin-Transformer-Object-Detection# cat configs/_base_/default_runtime.py
checkpoint_config = dict(interval=4)
# yapf:disable
log_config = dict(
    interval=40,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
custom_hooks = [dict(type='NumClassCheckHook')]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = "mask_rcnn_swin_1.pth"
resume_from = None  # "mask_rcnn_swin_1.pth"
workflow = [('train', 1), ('val', 1)]
Modify the pretrained weights
The main change is resizing the head tensors to your own number of classes (cat changeclass.py):
import torch

model_save_dir = "./"
pretrained_weights = torch.load('mask_rcnn_swin_tiny_patch4_window7.pth')
num_class = 1  # actual number of classes
# Truncate the COCO-pretrained heads to the new class count
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.weight'].resize_(num_class + 1, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.bias'].resize_(num_class + 1)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.weight'].resize_(num_class * 4, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.bias'].resize_(num_class * 4)
pretrained_weights['state_dict']['roi_head.mask_head.conv_logits.weight'].resize_(num_class, 256, 1, 1)
pretrained_weights['state_dict']['roi_head.mask_head.conv_logits.bias'].resize_(num_class)
torch.save(pretrained_weights, "{}/mask_rcnn_swin_{}.pth".format(model_save_dir, num_class))
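To see why those target shapes are what they are: the box classifier needs num_class + 1 outputs (the extra one is the background class), the box regressor needs 4 box deltas per class, and the mask head needs one logit map per class. A numpy sketch of the truncation (resize_ on a contiguous row-major tensor keeps the leading rows, which slicing mimics here; the arrays are shape-only stand-ins for the checkpoint tensors):

```python
import numpy as np

num_class = 1      # our single "cow" class
coco_classes = 80  # what the pretrained heads were sized for

# Pretrained COCO shapes, truncated along the leading dimension:
fc_cls_w = np.zeros((coco_classes + 1, 1024))[:num_class + 1]  # -> (2, 1024)
fc_reg_w = np.zeros((coco_classes * 4, 1024))[:num_class * 4]  # -> (4, 1024)
mask_w = np.zeros((coco_classes, 256, 1, 1))[:num_class]       # -> (1, 256, 1, 1)

print(fc_cls_w.shape, fc_reg_w.shape, mask_w.shape)
```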
Likewise update the dataset paths in configs/_base_/datasets/coco_instance.py:
dataset_type = 'CocoDataset'
data_root = 'data/cow/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',
        img_prefix=data_root + 'train2017/',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'val2017/',
        pipeline=test_pipeline))
evaluation = dict(metric=['bbox', 'segm'])
Modify the schedule and weight parameters
In configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py, adjust max_epochs and lr, and make it inherit the coco_instance dataset config:
_base_ = [
    '../_base_/models/mask_rcnn_swin_fpn.py',
    '../_base_/datasets/coco_instance.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
lr_config = dict(step=[27, 33])
runner = dict(type='EpochBasedRunnerAmp', max_epochs=40)
# do not use mmdet version fp16
fp16 = None
optimizer_config = dict(
    type="DistOptimizerHook",
    update_interval=1,
    grad_clip=None,
    coalesce=True,
    bucket_size_mb=-1,
    use_fp16=True,
)
Change the class labels in mmdet/core/evaluation/class_names.py and mmdet/datasets/coco.py:
def coco_classes():
    return ['cow']

class CocoDataset(CustomDataset):
    CLASSES = ('cow',)
Note that a single class still needs the trailing comma, and you must recompile afterwards: python setup.py install
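The trailing comma is easy to verify in plain Python: without it the parentheses are mere grouping, CLASSES becomes a string, and iterating over it would treat each character as a class name:

```python
# Parentheses alone do not make a tuple; the comma does.
CLASSES_WRONG = ('cow')    # just the 3-character string 'cow'
CLASSES_RIGHT = ('cow',)   # a 1-element tuple

print(type(CLASSES_WRONG).__name__, list(CLASSES_WRONG))  # str ['c', 'o', 'w']
print(type(CLASSES_RIGHT).__name__, list(CLASSES_RIGHT))  # tuple ['cow']
```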
Training
python tools/train.py configs/swin/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py
Inference results
root@c92561fab718:/workspace# cat infer.py
from mmdet.apis import init_detector, inference_detector
from mmdet.core.mask.utils import encode_mask_results
import torch
import os
import json
import numpy as np

# Model config file and trained checkpoint
config_file = './work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py'
checkpoint_file = './work_dirs/mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco/epoch_40.pth'
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("[INFO] running inference on {}".format(device))

# Build the model from the config file and checkpoint
model = init_detector(config_file, checkpoint_file, device=device)

results = []
for file in os.listdir("data/cow/images"):
    img = os.path.join("data/cow/images", file)
    result = inference_detector(model, img)  # (bbox_results, mask_results)
    print('images/' + file, len(result[0][0]), len(result[1][0]))
    mask = encode_mask_results(result[1])  # RLE-encode the binary masks
    image_id = "images/" + file
    for i in range(len(result[1][0])):
        results.append({
            "image_id": image_id,
            "category_id": 1,
            "segmentation": {
                "size": mask[0][i]["size"],
                "counts": mask[0][i]["counts"].decode(),
            },
            "score": round(float(result[0][0][i][4]), 3),
        })

with open('./results.json', mode='w', encoding='utf-8') as json_file:
    json.dump(results, json_file, ensure_ascii=False, indent=4)
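For reference, the "segmentation" field written above is COCO run-length encoding (RLE): the mask is flattened in column-major order and stored as alternating background/foreground run lengths, always starting with a background run. encode_mask_results returns a compressed byte-string variant via pycocotools; this hand-rolled coco_rle is only an illustration of the uncompressed idea, not the library's implementation:

```python
import numpy as np

def coco_rle(mask):
    """Uncompressed COCO-style RLE of a binary mask (column-major runs)."""
    flat = np.asarray(mask, dtype=np.uint8).flatten(order='F')
    counts, prev, run = [], 0, 0
    for v in flat:
        if v == prev:
            run += 1
        else:
            counts.append(run)  # close the previous run
            prev, run = v, 1
    counts.append(run)
    return {'size': list(mask.shape), 'counts': counts}

# A 2x2 mask whose right column is foreground: column-major flattening
# gives [0, 0, 1, 1], i.e. a background run of 2 then a foreground run of 2.
mask = np.array([[0, 1],
                 [0, 1]])
print(coco_rle(mask))  # {'size': [2, 2], 'counts': [2, 2]}
```

If a mask starts with foreground, the encoding begins with a zero-length background run, which keeps the alternation rule intact.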
The submission scored only around 0.6, so there is more to do: tuning hyperparameters, augmenting the data, or trying a Swin-based segmentation model such as Swin-Unet.