Vision Framework - Building on Core ML

What You Can Do with Vision

Face Detection
- Small Faces
- Strong Profiles
- Partially Occluded
- Hats and Glasses
- Face Landmarks
Image Registration
Rectangle Detection
Barcode Detection
Text Detection
Object Tracking
- faces
- rectangles
- general templates
Integration with Core ML

On Device vs. Cloud

Privacy
- Images and video stay on device
Cost
- No usage fees
- No data transfer
Real-time use cases
- No network latency, fast execution

Vision Concepts

Analyzing an Image

Tracking in a Sequence

Image Request Handler
- For interactive exploration of an image
- Holds on to the image for its lifecycle
- Allows optimization of various requests performed on an image
Sequence Request Handler
- For anything that looks at a sequence of images, such as tracking
- Does not optimize for multiple requests on an image
Putting It into Code

import Vision

// Create the request
let faceDetectionRequest = VNDetectFaceRectanglesRequest()

// Create the request handler for the image at fileURL
let myRequestHandler = VNImageRequestHandler(url: fileURL, options: [:])

// Send the request to the request handler; perform(_:) can throw
try myRequestHandler.perform([faceDetectionRequest])

// Do we have a face?
for observation in (faceDetectionRequest.results as? [VNFaceObservation]) ?? [] {
    // Do something with the observation, e.g. read observation.boundingBox
}

// Create a sequence request handler
let requestHandler = VNSequenceRequestHandler()

// Start the tracking with the observations from an earlier detection request
let observations = (detectionRequest.results as? [VNDetectedObjectObservation]) ?? []
let objectsToTrack = observations.map { VNTrackObjectRequest(detectedObjectObservation: $0) }

// Run the requests on the next frame; perform(_:on:) can throw
try requestHandler.perform(objectsToTrack, on: pixelBuffer)

// Let's look at the results
for request in objectsToTrack {
    for observation in (request.results as? [VNDetectedObjectObservation]) ?? [] {
        // Do something with the tracked observation
    }
}

Which Image Type Is Right for Me?

Vision supports various image types
- CVPixelBufferRef: video data output (e.g. AVCaptureVideoDataOutput)
- CGImageRef: UIImage
- CIImage: Core Image
- NSURL: disk
- NSData: web
The image type to choose depends on where the image comes from
You shouldn’t have to pre-scale the image
Make sure to pass in the EXIF orientation of the image
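
The orientation advice can be sketched as follows. This is a minimal illustration, not the session's own code: the helper names exifOrientation(fromRawValue:), exifOrientation(ofImageAt:) and makeHandler(for:orientation:) are hypothetical, and the fallback to .up for missing or invalid values is an assumption. It uses a CGImage as the example input because a CGImage carries no orientation metadata of its own.

```swift
import Foundation
import ImageIO
import Vision

/// Hypothetical helper: maps a raw EXIF orientation value (1...8) to
/// CGImagePropertyOrientation, falling back to .up for missing or invalid values.
func exifOrientation(fromRawValue raw: UInt32?) -> CGImagePropertyOrientation {
    // The range check is explicit because imported C enums may not validate raw values.
    guard let raw = raw, (1...8).contains(raw),
          let orientation = CGImagePropertyOrientation(rawValue: raw) else {
        return .up
    }
    return orientation
}

/// Hypothetical helper: reads the EXIF orientation stored in an image file on disk.
func exifOrientation(ofImageAt url: URL) -> CGImagePropertyOrientation {
    guard let source = CGImageSourceCreateWithURL(url as CFURL, nil),
          let properties = CGImageSourceCopyPropertiesAtIndex(source, 0, nil) as? [CFString: Any],
          let raw = properties[kCGImagePropertyOrientation] as? UInt32 else {
        return .up
    }
    return exifOrientation(fromRawValue: raw)
}

/// Hypothetical helper: builds a request handler for a CGImage and passes
/// the orientation along so Vision sees the image upright.
func makeHandler(for cgImage: CGImage,
                 orientation: CGImagePropertyOrientation) -> VNImageRequestHandler {
    return VNImageRequestHandler(cgImage: cgImage, orientation: orientation, options: [:])
}
```

The other image types follow the same pattern: VNImageRequestHandler has matching initializers that take a CVPixelBuffer, CIImage, URL, or Data together with an orientation.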

What Am I Going to Do with the Image?

Interactively explore the image
- Use VNImageRequestHandler and hold onto it
- Remember that the input image is immutable
Tracking an observation
- Use VNSequenceRequestHandler
- Tracking state is kept in the VNSequenceRequestHandler
- Lifecycle of images is not tied to the life of the VNSequenceRequestHandler
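
For the interactive case, one way to "hold onto" the handler is to wrap it in a small object that lives as long as the image is being explored. The sketch below assumes a hypothetical ImageExplorer class (not part of Vision) and a CGImage input; it runs two different requests against the same VNImageRequestHandler.

```swift
import Foundation
import Vision

/// Hypothetical wrapper that keeps one VNImageRequestHandler alive
/// for as long as a single image is being explored.
final class ImageExplorer {
    private let handler: VNImageRequestHandler

    init(cgImage: CGImage, orientation: CGImagePropertyOrientation = .up) {
        // The handler holds on to the (immutable) input image.
        handler = VNImageRequestHandler(cgImage: cgImage, orientation: orientation, options: [:])
    }

    /// Runs face detection on the held image.
    func detectFaces() throws -> [VNFaceObservation] {
        let request = VNDetectFaceRectanglesRequest()
        try handler.perform([request])
        return (request.results as? [VNFaceObservation]) ?? []
    }

    /// Runs rectangle detection on the same image, reusing the same handler.
    func detectRectangles() throws -> [VNRectangleObservation] {
        let request = VNDetectRectanglesRequest()
        try handler.perform([request])
        return (request.results as? [VNRectangleObservation]) ?? []
    }
}
```

Reusing one handler is what gives Vision the chance to optimize across requests on the same image; for tracking across frames, a VNSequenceRequestHandler is used instead, as noted above.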

What Performance Do I Need or Want?

Vision tasks can be time consuming and processing intensive

Dispatch your work to a queue with an appropriate QoS
Use the completion handler to work with the results
The completion handler is called on the same queue on which the request is performed
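
A minimal sketch of these three points, assuming a hypothetical detectFaces(in:completion:) function, a CGImage input, and the .userInitiated QoS class (choose whichever QoS fits your app). The delivered flag is an illustrative guard so the caller's completion closure is invoked only once if perform(_:) also throws.

```swift
import Foundation
import Vision

/// Hypothetical helper: runs face detection off the main thread and
/// reports the outcome through a completion closure.
func detectFaces(in cgImage: CGImage,
                 completion: @escaping (Result<[VNFaceObservation], Error>) -> Void) {
    // Dispatch the work to a background queue with an explicit QoS.
    DispatchQueue.global(qos: .userInitiated).async {
        var delivered = false

        // The request's completion handler runs on the queue that calls perform(_:).
        let request = VNDetectFaceRectanglesRequest(completionHandler: { request, error in
            delivered = true
            if let error = error {
                completion(.failure(error))
            } else {
                completion(.success((request.results as? [VNFaceObservation]) ?? []))
            }
        })

        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        do {
            try handler.perform([request])
        } catch {
            // Report the thrown error only if the completion handler has not already run.
            if !delivered { completion(.failure(error)) }
        }
    }
}
```

Because the results arrive on the background queue, any UI update made from the completion closure should hop back to the main queue first.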
