How Apple used deep neural networks to bring face detection to iPhone and iPad

'Apple started using deep learning for face detection in iOS 10. With the release of the Vision framework, developers can now use this technology and many other computer vision algorithms in their apps.'
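For a sense of what that developer-facing API looks like, here is a minimal sketch of on-device face detection with the Vision framework (iOS 11+). The function name and the `photo` parameter are my own; the Vision calls (`VNDetectFaceRectanglesRequest`, `VNImageRequestHandler`) are the framework's actual API.

```swift
import UIKit
import Vision

// Minimal sketch: detect face bounding boxes in a UIImage, on-device.
// `detectFaces` is a hypothetical wrapper, not Apple's code.
func detectFaces(in photo: UIImage, completion: @escaping ([VNFaceObservation]) -> Void) {
    guard let cgImage = photo.cgImage else { return completion([]) }

    let request = VNDetectFaceRectanglesRequest { request, _ in
        // Each observation carries a normalized bounding box for one face.
        let faces = request.results as? [VNFaceObservation] ?? []
        DispatchQueue.main.async { completion(faces) }
    }

    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    DispatchQueue.global(qos: .userInitiated).async {
        try? handler.perform([request])
    }
}
```

The work runs off the main thread because, as the journal excerpt below explains, inference competes for CPU/GPU time with everything else on the device.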

Apple doesn't want to store your data on its servers. There's just no way to guarantee your privacy once your data leaves your device. But providing services on-device is a huge challenge as well.

From the Apple Machine Learning Journal:

We faced several challenges. The deep-learning models need to be shipped as part of the operating system, taking up valuable NAND storage space. They also need to be loaded into RAM and require significant computational time on the GPU and/or CPU. Unlike cloud-based services, whose resources can be dedicated solely to a vision problem, on-device computation must take place while sharing these system resources with other running applications. Finally, the computation must be efficient enough to process a large Photos library in a reasonably short amount of time, but without significant power usage or thermal increase.

The rest of this article discusses our algorithmic approach to deep-learning-based face detection, and how we successfully met the challenges to achieve state-of-the-art accuracy. We discuss:

- how we fully leverage our GPU and CPU (using BNNS and Metal)
- memory optimizations for network inference, and image loading and caching
- how we implemented the network in a way that did not interfere with the multitude of other simultaneous tasks expected of iPhone

Fascinating insight into the direction more and more of our computational experiences are going. Read the journal for much more, including flow diagrams.

Also, check out the WWDC 2017 session on the Vision framework (built on Core ML) for a really good breakdown of the results.
