Vision Framework - Building on Core ML

What You Can Do with Vision

  1. Face Detection
    • Small Faces
    • Strong Profiles
    • Partially Occluded
    • Hats and Glasses
    • Face Landmarks
  2. Image Registration
  3. Rectangle Detection
  4. Barcode Detection
  5. Text Detection
  6. Object Tracking
    • Faces
    • Rectangles
    • General templates
  7. Integration with Core ML
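
As a rough illustration of the Core ML integration in item 7, the sketch below runs a classifier through Vision. The function name `classify` and the `model` parameter are assumptions made for this example; any image classification model you already have would stand in for it.

```swift
import CoreGraphics
import CoreML
import Vision

/// Classifies an image with a Core ML model driven by Vision.
/// `model` is a hypothetical image classifier supplied by the caller.
func classify(_ cgImage: CGImage, with model: MLModel) throws -> [VNClassificationObservation] {
    // Wrap the Core ML model so Vision can run it.
    let visionModel = try VNCoreMLModel(for: model)

    // Vision scales and converts the image to the model's input format.
    let request = VNCoreMLRequest(model: visionModel)
    request.imageCropAndScaleOption = .centerCrop

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try handler.perform([request])

    // A classifier model yields classification observations;
    // other model types produce other observation classes.
    return request.results as? [VNClassificationObservation] ?? []
}
```

Each returned observation carries an `identifier` (the label) and a `confidence` value, so a caller would typically read the first few entries.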

On Device vs. Cloud

  1. Privacy
    • Images and video stay on device
  2. Cost
    • No usage fees
    • No data transfer
  3. Real-time use cases
    • No network latency, fast execution

Vision Concepts

  1. Analyzing an Image


  2. Tracking in a Sequence


  3. Image Request Handler
    • For interactive exploration of an image
    • Holds on to the image for its lifecycle
    • Allows optimization of various requests performed on an image
  4. Sequence Request Handler
    • For anything that looks at images in a sequence like tracking
    • Does not optimize for multiple requests on an image
  5. Putting It into Code
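// Requires `import Vision`; `fileURL` is the URL of an image on disk.

// Create the request
let faceDetectionRequest = VNDetectFaceRectanglesRequest()

// Create the request handler for the image
let myRequestHandler = VNImageRequestHandler(url: fileURL, options: [:])

// Send the request to the request handler.
// perform(_:) throws, so call it with try inside a do/catch or a throwing function.
try myRequestHandler.perform([faceDetectionRequest])

// Do we have a face?
let faces = faceDetectionRequest.results as? [VNFaceObservation] ?? []
for observation in faces {
    // Do something with observation
}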

// Create a sequence request handler
let requestHandler = VNSequenceRequestHandler()

// Start the tracking with the observations from an earlier detection request
let observations = detectionRequest.results as? [VNDetectedObjectObservation] ?? []
let objectsToTrack = observations.map { VNTrackObjectRequest(detectedObjectObservation: $0) }

// Run the requests on the next frame; perform(_:on:) throws
try requestHandler.perform(objectsToTrack, on: pixelBuffer)

// Let's look at the results
for request in objectsToTrack {
    for observation in request.results as? [VNDetectedObjectObservation] ?? [] {
        // Do something with observation
    }
}
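
The point that an image request handler can optimize several requests on one image can be sketched as follows. This is a minimal example, assuming a file URL supplied by the caller; the function name `analyzeImage` and the returned tuple are illustrative, not part of Vision.

```swift
import Foundation
import Vision

/// Runs several detectors against one image through a single handler.
/// `imageURL` is a hypothetical file URL supplied by the caller.
func analyzeImage(at imageURL: URL) throws -> (faces: Int, rectangles: Int, barcodes: Int) {
    let faceRequest = VNDetectFaceRectanglesRequest()
    let rectangleRequest = VNDetectRectanglesRequest()
    let barcodeRequest = VNDetectBarcodesRequest()

    // One handler holds the image, so Vision can share work between requests.
    let handler = VNImageRequestHandler(url: imageURL, options: [:])
    try handler.perform([faceRequest, rectangleRequest, barcodeRequest])

    // Each request keeps its own results after perform returns.
    return (faceRequest.results?.count ?? 0,
            rectangleRequest.results?.count ?? 0,
            barcodeRequest.results?.count ?? 0)
}
```

Passing all requests in one `perform` call, rather than creating a new handler per request, is what gives the handler room to reuse intermediate work on the image.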

Which Image Type Is Right for Me?

  1. Vision supports various image types
    • CVPixelBufferRef: video data output (e.g. AVCaptureVideoDataOutput)
    • CGImageRef: UIImage
    • CIImage: Core Image
    • NSURL: disk
    • NSData: web
  2. The image type to choose depends on where the image comes from
  3. You shouldn’t have to pre-scale the image
  4. Make sure to pass in the EXIF orientation of the image
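
To make the list above concrete, here is a small sketch that builds a request handler for each supported image type and always passes the orientation along. The `ImageSource` enum and `makeHandler` function are hypothetical names introduced for this example; the `VNImageRequestHandler` initializers are the ones Vision provides.

```swift
import CoreImage
import CoreVideo
import Foundation
import ImageIO
import Vision

/// The image sources listed above, modeled as a hypothetical enum.
enum ImageSource {
    case pixelBuffer(CVPixelBuffer)  // frames from a video data output
    case cgImage(CGImage)            // e.g. UIImage.cgImage
    case ciImage(CIImage)            // Core Image pipelines
    case file(URL)                   // image on disk
    case data(Data)                  // bytes downloaded from the web
}

/// Builds a handler for whichever source the image came from,
/// passing the EXIF orientation in every case.
func makeHandler(for source: ImageSource,
                 orientation: CGImagePropertyOrientation) -> VNImageRequestHandler {
    switch source {
    case .pixelBuffer(let buffer):
        return VNImageRequestHandler(cvPixelBuffer: buffer, orientation: orientation, options: [:])
    case .cgImage(let image):
        return VNImageRequestHandler(cgImage: image, orientation: orientation, options: [:])
    case .ciImage(let image):
        return VNImageRequestHandler(ciImage: image, orientation: orientation, options: [:])
    case .file(let url):
        return VNImageRequestHandler(url: url, orientation: orientation, options: [:])
    case .data(let data):
        return VNImageRequestHandler(data: data, orientation: orientation, options: [:])
    }
}
```

When the image comes from a UIImage, note that `UIImage.Orientation` and `CGImagePropertyOrientation` use different raw values, so map between them case by case rather than casting one to the other.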

What Am I Going to Do with the Image?

  1. Interactively explore the image
    • Use VNImageRequestHandler and hold onto it
    • Remember that the input image is immutable
  2. Tracking an observation
    • Use VNSequenceRequestHandler
    • Tracking state is kept in the VNSequenceRequestHandler
    • Lifecycle of images is not tied to the life of the VNSequenceRequestHandler
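
Because the tracking state lives in the VNSequenceRequestHandler, the handler has to outlive individual frames. A minimal sketch of that pattern follows; the `ObjectTracker` class and its method names are assumptions for illustration, and the initial observation is expected to come from an earlier detection request.

```swift
import CoreVideo
import Vision

/// Tracks one object across frames. The class and its API are hypothetical.
final class ObjectTracker {
    // Keep one sequence handler alive: it carries the tracking state.
    private let sequenceHandler = VNSequenceRequestHandler()
    private var lastObservation: VNDetectedObjectObservation

    /// `initialObservation` comes from an earlier detection request.
    init(initialObservation: VNDetectedObjectObservation) {
        lastObservation = initialObservation
    }

    /// Call once per frame. Returns the updated observation, or nil if
    /// the tracker produced no result for this frame.
    func track(in pixelBuffer: CVPixelBuffer) throws -> VNDetectedObjectObservation? {
        // Seed each frame's request with the most recent observation.
        let request = VNTrackObjectRequest(detectedObjectObservation: lastObservation)
        try sequenceHandler.perform([request], on: pixelBuffer)

        guard let updated = request.results?.first as? VNDetectedObjectObservation else {
            return nil
        }
        lastObservation = updated
        return updated
    }
}
```

The pixel buffers are passed in per call and are not retained by the tracker, which matches the note that the lifecycle of the images is independent of the handler.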

What Performance Do I Need or Want?

Vision tasks can be time-consuming and processing-intensive

  • Dispatch your work on a queue with an appropriate QoS
  • Use the completion handler to work with the results
  • The completion handler is called on the same queue that performs the request
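
The three points above might look like this in practice. This is a sketch under stated assumptions: the function name `detectFaces`, the `Result`-based callback, and the `.userInitiated` QoS are choices made for the example, not requirements of Vision.

```swift
import Foundation
import Vision

/// Detects faces off the main thread. `imageURL` and the callback shape
/// are assumptions for illustration.
func detectFaces(at imageURL: URL,
                 completion: @escaping (Result<[VNFaceObservation], Error>) -> Void) {
    // Vision work is heavy, so keep it off the main queue.
    DispatchQueue.global(qos: .userInitiated).async {
        // Defensive flag so the caller is notified exactly once.
        var delivered = false

        // The request's completion handler runs on this same queue.
        let request = VNDetectFaceRectanglesRequest(completionHandler: { request, error in
            delivered = true
            if let error = error {
                completion(.failure(error))
            } else {
                completion(.success(request.results as? [VNFaceObservation] ?? []))
            }
        })

        do {
            try VNImageRequestHandler(url: imageURL, options: [:]).perform([request])
        } catch {
            // Report the thrown error only if the handler has not already done so.
            if !delivered { completion(.failure(error)) }
        }
    }
}
```

Since the callback arrives on the background queue, a caller that updates UI would hop back with `DispatchQueue.main.async` inside its completion closure.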