Faster R-CNN: Object Detection Algorithm
Towards Real-Time Object Detection with Region Proposal Networks
R-CNN: Regions with CNN features
- Input image
- Extract region proposals (~2k)
- Compute CNN features
- Classify regions
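IoU (Intersection over Union)
A standard metric for how accurately an object is detected on a given dataset
Predicted region: bounding boxes
Ground-truth bounding boxes (the approximate extent of each object to be detected, hand-labeled in the training images)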
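A minimal sketch of the IoU computation, assuming boxes are given as (x1, y1, x2, y2) corner coordinates; the function name `iou` and the box layout are illustrative choices, not something the notes specify.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) form."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth))  # overlap 1, union 7
```

An IoU of 1 means the predicted box coincides with the ground-truth box; 0 means they do not overlap at all.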
NMS (Non-Maximum Suppression)
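The usual greedy form of NMS can be sketched as follows: keep the highest-scoring box, drop every remaining box whose IoU with it exceeds a threshold, and repeat. The helper names and the 0.5 threshold are assumptions for illustration.

```python
def box_iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept by greedy non-maximum suppression."""
    # Visit boxes from highest to lowest score.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard boxes that overlap the kept box too much.
        order = [i for i in order
                 if box_iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```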
Fast R-CNN
Selective search
Anchor sliding window Feature extraction
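A rough sketch of how anchors could be laid out over the sliding-window positions of a feature map. The stride of 16, the scales (128, 256, 512) and the aspect ratios (1:2, 1:1, 2:1) are the defaults reported in the Faster R-CNN paper and are assumptions here; the notes do not state them. The function name is illustrative.

```python
import math


def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centred on every feature-map cell."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            # Centre of this cell in input-image coordinates.
            cx = (x + 0.5) * stride
            cy = (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    # r is height / width; the area stays s * s.
                    w = s / math.sqrt(r)
                    h = s * math.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 positions, 9 anchors each
```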
RPN Loss
Cls label: binary classification (object or no object), assigned from the IoU between each anchor box and the ground-truth (gt) bounding boxes
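A simplified sketch of assigning the binary label from IoU. The thresholds 0.7 (positive) and 0.3 (negative) follow the Faster R-CNN paper and are assumptions here; the paper's extra rule that the highest-IoU anchor for each gt box is also positive is left out for brevity. Function names are illustrative.

```python
def overlap(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (object), 0 (background) or -1 (ignored) for each anchor."""
    labels = []
    for anchor in anchors:
        best = max((overlap(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(0)
        else:
            labels.append(-1)  # ambiguous overlap: excluded from the loss
    return labels


anchors = [(0, 0, 10, 10), (0, 0, 10, 20), (50, 50, 60, 60)]
print(label_anchors(anchors, gt_boxes=[(0, 0, 10, 10)]))
```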
Loc label
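The regression label is commonly an offset of the gt box relative to the anchor rather than raw coordinates. The sketch below assumes the (tx, ty, tw, th) parameterization used in the R-CNN family of papers; the function name is hypothetical.

```python
import math


def encode_box(anchor, gt):
    """Encode a gt box relative to an anchor as (tx, ty, tw, th)."""
    ax1, ay1, ax2, ay2 = anchor
    gx1, gy1, gx2, gy2 = gt
    aw, ah = ax2 - ax1, ay2 - ay1
    gw, gh = gx2 - gx1, gy2 - gy1
    acx, acy = ax1 + 0.5 * aw, ay1 + 0.5 * ah
    gcx, gcy = gx1 + 0.5 * gw, gy1 + 0.5 * gh
    tx = (gcx - acx) / aw      # centre shift, in units of anchor width
    ty = (gcy - acy) / ah      # centre shift, in units of anchor height
    tw = math.log(gw / aw)     # log scale change in width
    th = math.log(gh / ah)     # log scale change in height
    return (tx, ty, tw, th)


# A gt box shifted right by half the anchor width, same size.
print(encode_box((0, 0, 10, 10), (5, 0, 15, 10)))
```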
Cls loss
Cross entropy
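For the two-class (object / no object) case, cross entropy reduces to the binary log loss. A minimal sketch, assuming `probs` are predicted object probabilities and `labels` are 0 or 1; the clamping constant `eps` is an implementation detail added here to avoid log(0).

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross entropy over a batch of anchors."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


print(binary_cross_entropy([0.9, 0.2], [1, 0]))
```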
Loc Loss
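The notes do not name the localization loss; the Faster R-CNN paper uses smooth L1 on the box offsets, computed only for positive anchors. A sketch under that assumption:

```python
def smooth_l1(pred, target):
    """Sum of smooth L1 terms over the coordinates of one box offset."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        # Quadratic near zero, linear for large errors.
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


print(smooth_l1([0.5, 2.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]))
```

The linear tail makes the loss less sensitive to outlier boxes than a plain squared error.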
RoI Head (RoI: Region of Interest)
Mask R-CNN
To this we apply a per-pixel sigmoid, and define L_mask as the average binary cross-entropy loss. For an RoI associated with ground-truth class k, L_mask is only defined on the k-th mask (other mask outputs do not contribute to the loss).
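A small sketch of that mask loss for one RoI, assuming `mask_logits` holds K masks of raw scores as nested lists and `gt_mask` is a 0/1 mask of the same size. The names are illustrative, and the sigmoid is folded into a numerically stable cross-entropy formula.

```python
import math


def mask_loss(mask_logits, gt_mask, k):
    """Average per-pixel binary cross entropy on the k-th mask only."""
    logits_k = mask_logits[k]  # masks of the other classes are ignored
    total, count = 0.0, 0
    for row_z, row_y in zip(logits_k, gt_mask):
        for z, y in zip(row_z, row_y):
            # Equals -[y*log(sigmoid(z)) + (1-y)*log(1-sigmoid(z))].
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


logits = [
    [[0.0, 0.0], [0.0, 0.0]],    # class 0 mask
    [[5.0, -5.0], [3.0, 1.0]],   # class 1 mask, unused when k = 0
]
gt = [[1, 0], [0, 1]]
print(mask_loss(logits, gt, k=0))
```

Because only the k-th mask enters the loss, the classes do not compete with each other at the pixel level.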
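RoI Align: coordinates are not snapped to the feature-map grid; floating-point values are kept, and each small bin is further subdivided into sampling points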
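A simplified single-channel sketch of the RoI Align idea: keep the RoI in floating-point feature-map coordinates, place a few sample points inside every output bin, read each by bilinear interpolation, and average. The function names, the 2x2 output, and the coordinate convention (feature values sit at integer coordinates) are assumptions; real implementations differ in such details.

```python
import math


def bilinear(feat, y, x):
    """Bilinearly interpolate a 2-D list `feat` at float position (y, x)."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bottom = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feat, roi, out_size=2, samples=2):
    """Pool a float RoI (x1, y1, x2, y2) into an out_size x out_size grid."""
    x1, y1, x2, y2 = roi
    bin_w = (x2 - x1) / out_size  # no rounding of the bin size
    bin_h = (y2 - y1) / out_size
    out = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            acc = 0.0
            # Regularly spaced sample points inside this bin.
            for sy in range(samples):
                for sx in range(samples):
                    y = y1 + (i + (sy + 0.5) / samples) * bin_h
                    x = x1 + (j + (sx + 0.5) / samples) * bin_w
                    acc += bilinear(feat, y, x)
            row.append(acc / (samples * samples))
        out.append(row)
    return out


# Feature map whose value equals its column index.
feat = [[float(c) for c in range(4)] for _ in range(4)]
print(roi_align(feat, roi=(0.5, 0.5, 2.5, 2.5)))
```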
CTPN: Text Detection Algorithm
Detecting Text in Natural Image with Connectionist Text Proposal Network
- Detecting text in fine-scale proposals
- Recurrent connectionist text proposals
- Side-refinement
Text line construction
Code
bounding box
CRNN: Text Recognition Algorithm
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
- CRNN
- Code
- CTC
- lexicon-based
- lexicon-free
feature sequence: receptive field
CRNN + CTC
CTC Theory
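So that every path is represented uniquely and legally in the graph, node transitions obey the following constraints:
- Transitions may only move to the right and downward; no other direction is allowed
- There must be at least one blank between identical characters
- Non-blank characters cannot be skipped
- A path must start at one of the first two symbols
- A path must end at one of the last two symbols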
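These constraints can be sketched on the blank-extended label sequence (a blank before, between, and after the characters). The blank symbol "-" and both function names are illustrative assumptions, and the label is assumed not to contain "-" itself.

```python
BLANK = "-"


def extend_with_blanks(label):
    """'ab' -> ['-', 'a', '-', 'b', '-']"""
    ext = [BLANK]
    for ch in label:
        ext += [ch, BLANK]
    return ext


def allowed_next(ext, s):
    """Positions reachable at the next time step from position s."""
    nxt = [s]                      # stay on the same symbol
    if s + 1 < len(ext):
        nxt.append(s + 1)          # advance by one symbol
    # Skipping is allowed only over a blank, and only when the characters
    # on either side differ, so identical characters keep a blank between
    # them and no non-blank character is ever skipped.
    if s + 2 < len(ext) and ext[s + 1] == BLANK and ext[s + 2] != ext[s]:
        nxt.append(s + 2)
    return nxt


ext = extend_with_blanks("aab")
start_positions = [0, 1]                    # first two symbols
end_positions = [len(ext) - 2, len(ext) - 1]  # last two symbols
print(ext)
print(allowed_next(ext, 1), allowed_next(ext, 3))
```

In the example, the first "a" cannot jump straight to the second "a", while the second "a" may skip the blank and move directly to "b".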
forward-backward
Define the forward probability as the total probability of all prefix sub-paths that pass through node s at time t
Case 1: the s-th symbol is the blank symbol
Case 2: the s-th symbol equals the (s-2)-th symbol
Case 3: neither case 1 nor case 2 applies
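A compact sketch of the forward recursion under the standard CTC formulation: in cases 1 and 2 a node receives probability from positions s and s-1 at the previous time step, and in case 3 it also receives it from s-2. The input layout (`probs[t][c]` as per-frame class probabilities, blank at index 0) and the function name are assumptions; a practical version would work in log space to avoid underflow.

```python
def ctc_forward(probs, label, blank=0):
    """Return (p(label | input), alpha) using the CTC forward pass."""
    # Extended sequence: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for c in label:
        ext += [c, blank]
    T, S = len(probs), len(ext)

    alpha = [[0.0] * S for _ in range(T)]
    # A path must start at one of the first two symbols.
    alpha[0][0] = probs[0][ext[0]]
    if S > 1:
        alpha[0][1] = probs[0][ext[1]]

    for t in range(1, T):
        for s in range(S):
            total = alpha[t - 1][s]
            if s >= 1:
                total += alpha[t - 1][s - 1]
            # Case 3: not blank and different from the symbol at s-2,
            # so the blank in between may be skipped.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[t - 1][s - 2]
            alpha[t][s] = total * probs[t][ext[s]]

    # A path must end at one of the last two symbols.
    p = alpha[T - 1][S - 1]
    if S > 1:
        p += alpha[T - 1][S - 2]
    return p, alpha


# Two frames, two classes (0 = blank, 1 = 'a'), uniform outputs.
probs = [[0.5, 0.5], [0.5, 0.5]]
p, _ = ctc_forward(probs, label=[1])
print(p)  # paths "a-", "-a", "aa" each have probability 0.25
```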