Application example: Photo OCR
Problem description and pipeline
Photo OCR pipeline:
- Text detection
- Character segmentation
- Character classification
Sliding windows
Text detection:
Sliding window detection:
- step-size/stride
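A minimal sketch of sliding window detection, to show how the step size/stride controls how many patches get classified. The function names, the list-of-lists image format, and the toy brightness classifier are all assumptions made for illustration; in practice the classifier would be a trained text/non-text model.

```python
def sliding_windows(height, width, win_h, win_w, stride):
    """Yield the (top, left) corner of every window that fits inside the image."""
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            yield top, left


def detect(image, win_h, win_w, stride, classifier):
    """Run the classifier on every window and return the corners it accepts."""
    height, width = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(height, width, win_h, win_w, stride):
        patch = [row[left:left + win_w] for row in image[top:top + win_h]]
        if classifier(patch):
            hits.append((top, left))
    return hits


def toy_classifier(patch):
    """Hypothetical stand-in for a trained model: accept mostly bright patches."""
    pixels = [p for row in patch for p in row]
    return sum(pixels) / len(pixels) > 0.5


# Toy 4x6 image with one bright 2x2 region.
image = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
hits = detect(image, win_h=2, win_w=2, stride=2, classifier=toy_classifier)
```

A smaller stride examines more windows (finer localisation, more computation); a larger stride is cheaper but can step over the object.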
Character segmentation:
1D Sliding window for character segmentation
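The 1D case can be sketched the same way: the window moves only left to right along a detected text region, and a classifier decides whether the middle of the window is a gap between two characters. The per-column ink counts and the rule in `toy_is_split` are assumptions for illustration, standing in for a trained split/no-split classifier.

```python
def find_splits(column_ink, win_w, is_split, stride=1):
    """Slide a 1D window along the columns and return the split positions."""
    splits = []
    for left in range(0, len(column_ink) - win_w + 1, stride):
        window = column_ink[left:left + win_w]
        if is_split(window):
            splits.append(left + win_w // 2)  # report the window's centre column
    return splits


def toy_is_split(window):
    """Hypothetical rule: empty centre column with ink on both edges."""
    mid = len(window) // 2
    return window[mid] == 0 and window[0] > 0 and window[-1] > 0


# Amount of ink in each pixel column of a toy text region.
column_ink = [3, 4, 0, 5, 5, 0, 2]
splits = find_splits(column_ink, win_w=3, is_split=toy_is_split)
```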
Getting lots of data: Artificial data synthesis
- Create new data.
- Synthesizing data by introducing distortions
- Distortions introduced should be representative of the type of noise/distortions in the test set.
- Usually does not help to add purely random/meaningless noise to your data.
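As a sketch of synthesizing data by distortion, the snippet below turns one labelled example into several by shifting it a pixel or two. It assumes that small translations are a distortion that actually occurs in the test set; the helper names and the list-of-lists image format are illustrative only.

```python
def shift(image, dx, dy, fill=0):
    """Return a copy of the image translated by (dx, dy), padding with `fill`."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = image[y][x]
    return out


def synthesize(image, label, offsets):
    """Create one new labelled example per (dx, dy) offset; the label is unchanged."""
    return [(shift(image, dx, dy), label) for dx, dy in offsets]


# A toy 3x3 glyph for the letter "I".
glyph = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
new_examples = synthesize(glyph, "I", offsets=[(1, 0), (-1, 0)])
```

The label stays the same because the distortion does not change which character is shown. Adding independent random noise to each pixel would also produce new examples, but per the note above it usually does not help.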
Ceiling analysis: What part of the pipeline to work on next
Estimating the errors due to each component
What part of the pipeline should you spend the most time trying to improve?
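One way to sketch ceiling analysis: replace each stage's output with the ground truth, one stage at a time and cumulatively in pipeline order, and record the overall system accuracy after each replacement. The increase at each step is the most that perfecting that stage could gain. The accuracy figures below are made-up illustrative numbers, not measurements, and `ceiling_gains` is a hypothetical helper.

```python
def ceiling_gains(baseline, with_perfect):
    """Return the accuracy gain attributable to each stage.

    `with_perfect` lists (stage, overall accuracy) pairs in pipeline order,
    where each accuracy is measured with that stage and all earlier stages
    replaced by ground-truth output.
    """
    gains = {}
    previous = baseline
    for stage, accuracy in with_perfect:
        gains[stage] = accuracy - previous
        previous = accuracy
    return gains


# Illustrative overall accuracies in percent (assumed, not measured).
baseline = 70
with_perfect = [
    ("text detection", 85),
    ("character segmentation", 88),
    ("character classification", 100),
]
gains = ceiling_gains(baseline, with_perfect)
best = max(gains, key=gains.get)
```

With these numbers, text detection offers the largest potential gain and character segmentation the smallest, so effort spent on segmentation would be the least rewarding.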