1. Dataset annotation
2. Building the LMDB dataset
a. Use the ./data/VOC0712/create_list.sh and ./data/VOC0712/create_data.sh scripts from weiliu89's SSD repository to build the dataset. On my machine the project lives in ssd/caffe. The following files need to be adapted to your own setup:
scripts/create_annoset.py, examples/ssd/ssd_pascal.py, examples/ssd/score_ssd_pascal.py
Point the scripts at your Caffe installation (the original snippet appended 'python' to caffe_root without a separator, which yields a wrong path; fixed below):
caffe_root = '/home/ljg/ssd/caffe'
import os
os.chdir(caffe_root)
import sys
sys.path.insert(0, caffe_root + '/python')
Comment out the lines that pin training to a specific GPU.
Some other settings are explained in the following posts (titles translated from Chinese):
Training and testing SSD on your own data with Caffe
Training your own dataset with the SSD framework
Then run the following in the $CAFFE_ROOT directory:
./data/ljy_test/create_list_indoor.sh
./data/ljy_test/create_data_indoor.sh
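What create_list.sh produces is essentially a text file of "image_path annotation_path" pairs, one per line, which create_data.sh then turns into the LMDB. A minimal Python sketch of that pairing step, for reference only (the directory layout follows the VOC convention; the function name is illustrative, not from the actual scripts):

```python
import os

def write_list_file(dataset_root, split, out_path):
    """Pair each JPEG image with its VOC-style XML annotation and write
    'image_path annotation_path' lines, as create_list.sh does."""
    img_dir = os.path.join(dataset_root, "JPEGImages")
    ann_dir = os.path.join(dataset_root, "Annotations")
    split_file = os.path.join(dataset_root, "ImageSets", "Main", split + ".txt")
    with open(split_file) as f:
        names = [line.strip() for line in f if line.strip()]
    with open(out_path, "w") as out:
        for name in names:
            img = os.path.join(img_dir, name + ".jpg")
            ann = os.path.join(ann_dir, name + ".xml")
            # skip samples whose image or annotation file is missing
            if os.path.exists(img) and os.path.exists(ann):
                out.write(img + " " + ann + "\n")
```

If a name listed in the split file has no matching annotation, it is silently dropped; the real scripts behave similarly in that unmatched samples never reach the LMDB.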
3. Model training
chuanqi305/MobileNet-SSD
Training MobileNet-SSD on a self-made dataset
Follow the training steps described there.
4. Porting to mobile
ncnn usage explained (1): on the PC
Deploying to an iOS phone with ncnn
android ios 预编译库 20180129 f133729
This is the ncnn prebuilt library release I used.
Earlier releases did not support some layers in the trained model, and the following release produced different results for the same input image, which is probably a bug.
dangbo/ncnn-mobile
This is the iOS demo project I used. I built on top of it: replaced the ncnn library and the matching headers, then modified it to support real-time video detection.
AVCaptureConnection* videoConnection = [videoDataOutput connectionWithMediaType:AVMediaTypeVideo];
[videoConnection setVideoOrientation:AVCaptureVideoOrientationPortrait];
I capture with the front camera, so the orientation must be set as above; otherwise the frames come in sideways and detection fails.
Object object;
// one ncnn SSD detection: values = [class_id, prob, xmin, ymin, xmax, ymax], coords normalized to 0..1
object.class_id = values[0];
object.prob = values[1];
std::string label = std::string(class_names[object.class_id]);
object.lable = [NSString stringWithUTF8String:label.c_str()];
// x coordinates are mirrored (1 - x) because the front-camera feed is flipped horizontally;
// note this mirrors only the endpoints, so the resulting width is negative and the
// drawing code has to tolerate or normalize such rects
object.rec.x = (1 - values[2]) * screenW;
object.rec.y = values[3] * screenH;
object.rec.width = (1 - values[4]) * screenW - object.rec.x;
object.rec.height = values[5] * screenH - object.rec.y;
objects.push_back(object);
Modify the code as above to draw the detection boxes on the phone screen in real time.
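The coordinate math above can be sketched in Python. The detection vector layout follows ncnn's SSD output, [class_id, prob, xmin, ymin, xmax, ymax] with coordinates normalized to 0..1. One detail worth making explicit: horizontal mirroring maps xmin to (1 - xmax) and xmax to (1 - xmin), so the sketch swaps the endpoints to keep the width positive (the function name is illustrative, not from the demo project):

```python
def detection_to_screen_rect(values, screen_w, screen_h, mirror_x=True):
    """Map one normalized SSD detection to pixel coordinates.

    values: [class_id, prob, xmin, ymin, xmax, ymax], coords in 0..1.
    mirror_x flips horizontally, compensating for a front-camera feed.
    """
    xmin, ymin, xmax, ymax = values[2], values[3], values[4], values[5]
    if mirror_x:
        # mirroring swaps which endpoint becomes the left edge
        xmin, xmax = 1.0 - xmax, 1.0 - xmin
    x = xmin * screen_w
    y = ymin * screen_h
    return (x, y, (xmax - xmin) * screen_w, (ymax - ymin) * screen_h)
```

Swapping the endpoints is what keeps the rect in the usual origin-plus-positive-size form, instead of relying on the drawing code to handle a negative width.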
- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection
{
    @autoreleasepool {
        CFRetain(sampleBuffer);
        UIImage *image = [self imageFromSampleBuffer:sampleBuffer];
        CFRelease(sampleBuffer);
        [self predictFrameImage:image];
    }
}
I first convert the sampleBuffer to a UIImage and only then run prediction on it. This is a shortcut rather than the proper approach; a cleaner implementation would feed the pixel data to the network directly and skip the intermediate image.
- (UIImage *)imageFromSampleBuffer:(CMSampleBufferRef)sampleBuffer
{
    // Get the sample buffer's Core Video image buffer for the media data
    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    // Lock the base address of the pixel buffer
    CVPixelBufferLockBaseAddress(imageBuffer, 0);
    // Get the base address of the pixel buffer
    void *baseAddress = CVPixelBufferGetBaseAddress(imageBuffer);
    // Get the number of bytes per row for the pixel buffer
    size_t bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer);
    // Get the pixel buffer width and height
    size_t width = CVPixelBufferGetWidth(imageBuffer);
    size_t height = CVPixelBufferGetHeight(imageBuffer);
    // Create a device-dependent RGB color space
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    // Create a bitmap graphics context with the sample buffer data
    // (assumes the capture output delivers kCVPixelFormatType_32BGRA)
    CGContextRef context = CGBitmapContextCreate(baseAddress, width, height, 8,
        bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
    // Create a Quartz image from the pixel data in the bitmap graphics context
    CGImageRef quartzImage = CGBitmapContextCreateImage(context);
    // Unlock the pixel buffer
    CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
    // Free up the context and color space
    CGContextRelease(context);
    CGColorSpaceRelease(colorSpace);
    // Create an image object from the Quartz image, rotated to compensate
    // for the camera sensor's native landscape orientation
    UIImage *image = [UIImage imageWithCGImage:quartzImage scale:1.0f orientation:UIImageOrientationRight];
    // Release the Quartz image
    CGImageRelease(quartzImage);
    return image;
}
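The conversion above walks a BGRA pixel buffer whose rows may be padded, i.e. bytesPerRow can exceed width * 4, which is why the row stride must come from CVPixelBufferGetBytesPerRow rather than being derived from the width. The same stride-aware traversal can be sketched in Python (the BGRA byte order matches the bitmap flags used above; the function name is illustrative):

```python
def bgra_buffer_to_rgb_rows(base, width, height, bytes_per_row):
    """Read a BGRA byte buffer with possible row padding into rows of
    (r, g, b) tuples; mirrors the traversal the bitmap context performs."""
    rows = []
    for y in range(height):
        row_start = y * bytes_per_row  # stride may exceed width * 4
        row = []
        for x in range(width):
            i = row_start + x * 4
            b, g, r = base[i], base[i + 1], base[i + 2]  # bytes are B, G, R, A
            row.append((r, g, b))
        rows.append(row)
    return rows
```

Indexing past width * 4 within a row reads padding bytes, so the inner loop stops at the pixel width even though each row occupies bytes_per_row bytes.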