Text Detection and Recognition

Detection

Text instance level:

Anchor-based methods

EAST

Region proposal methods

R2CNN

Component-level:

SegLink, corner localization, CTPN

Pixel-level:

PixelLink

Multi-Oriented Text:

Instance Transformation Network (ITN)

Text of Irregular Shapes:

TextSnake

Recognition

CTC-based Methods

(a) CNN + softmax. (b) RNN + CTC. (c) RNN + Attention. (d) CNN + CTC.
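As a rough illustration of the CTC output stage used in variants (b) and (d), the sketch below shows greedy (best-path) decoding: take the top class per frame, collapse consecutive repeats, then drop blanks. The function names and the choice of index 0 for the blank symbol are assumptions made for this example, not something fixed by these notes.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def best_path(scores):
    """Pick the highest-scoring class index for each frame of a T x C score table."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse consecutive repeated labels, then remove blank labels."""
    collapsed = [label for label, _ in groupby(frame_labels)]
    return [label for label in collapsed if label != blank]


# Frames "_ a a _ a b b _" (with _ = blank, a = 1, b = 2) decode to "a a b":
# the blank between the two runs of 1 keeps them as separate characters.
decoded = ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0])
```

Greedy decoding is only the simplest option; beam search over the same per-frame scores is a common alternative when a lexicon or language model is available.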

Attention-based methods

FAN
(perspectively distorted or curved) STN + attention-based sequence recognition network: the STN predicts a thin-plate-spline (TPS) transformation that rectifies the irregular input text image into a more canonical form.
(perspectively distorted or curved) Four feature sequences in four directions: horizontal, reversed horizontal, vertical, and reversed vertical. A weighting mechanism combines the four feature sequences.
(perspectively distorted or curved) Alignment loss that regularizes the estimated attention at each time step. A coordinate map is also used as a second input to enforce spatial awareness.
HAM can handle different types of distortion
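The coordinate-map idea mentioned in the list above can be sketched as two extra input channels that store each pixel's x and y position. This is a minimal sketch only: the normalization to [-1, 1], the channel layout (a list of H x W channels), and the function names are assumptions for illustration, since the notes do not say how the coordinates are encoded.

```python
def coordinate_map(height, width):
    """Return two H x W channels holding x and y coordinates scaled to [-1, 1]."""
    def norm(i, n):
        # A single-pixel axis has no extent, so place it at the center.
        return 0.0 if n == 1 else 2.0 * i / (n - 1) - 1.0

    xs = [[norm(x, width) for x in range(width)] for _ in range(height)]
    ys = [[norm(y, height) for _ in range(width)] for y in range(height)]
    return xs, ys


def add_coordinate_channels(image):
    """Append x and y coordinate channels to an image given as a list of H x W channels."""
    height, width = len(image[0]), len(image[0][0])
    xs, ys = coordinate_map(height, width)
    return image + [xs, ys]


# A 1-channel 2 x 3 image becomes a 3-channel input (gray, x, y).
augmented = add_coordinate_channels([[[0, 0, 0], [0, 0, 0]]])
```

Feeding positions alongside pixel values gives the network explicit access to where each feature sits, which is the spatial awareness the notes refer to.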

Non-End-to-End System

SSD+CRNN
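A two-stage system of this kind can be sketched as a detector followed by a separately trained recognizer applied to each cropped region. The helpers below are hypothetical placeholders: `detect` stands in for a detector such as SSD and `recognize` for a recognizer such as CRNN, and axis-aligned boxes in (x0, y0, x1, y1) form are an assumption made to keep the example short.

```python
def crop(image, box):
    """Crop an axis-aligned box (x0, y0, x1, y1) from an image stored as a list of rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def read_text(image, detect, recognize):
    """Run the detector, then the recognizer on each detected region independently."""
    return [(box, recognize(crop(image, box))) for box in detect(image)]


# Toy stand-ins: the "image" is a grid of characters, the "detector" returns
# one fixed box, and the "recognizer" joins the first row of the patch.
toy_image = [list("ab"), list("cd")]
results = read_text(
    toy_image,
    detect=lambda img: [(0, 0, 2, 1)],
    recognize=lambda patch: "".join(patch[0]),
)
```

Because the two stages share no features and are trained separately, detection errors pass straight through to recognition, which is the main contrast with the end-to-end designs listed next.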

End-to-End

Faster-RCNN + an encoder-decoder based text recognition model
EAST/YOLO2 branch -> text proposals -> map to CTC-based methods