Google’s ‘Frankenphone’ Helped Gather Training Data for Portrait Mode on the Pixel 3


Google has been leading the charge in computational photography on smartphones ever since the first Pixel, when it introduced its HDR+ technology. Since then, the search giant has not looked back, and with the introduction of the Pixel 3 series, it’s evident that it’s miles ahead of the competition when it comes to camera software. The Pixel 3 and Pixel 3 XL are the current flagship Android smartphones from Google and have some of the best cameras in the business, largely thanks to improvements to the Portrait mode algorithm and new features like Night Sight and Super Res Zoom.

Earlier this week, Gadgets 360 was privy to a media roundtable with Marc Levoy, a distinguished engineer at Google’s research lab, where he talked in detail about how these new technologies were developed for the new smartphones.

In the Pixel 3 series, Google has revamped Portrait mode, moving away from stereo-based depth maps to a technique that uses machine learning to deliver more accurate edge detection and a more realistic background blur. Google has also released a blog post which details this, but here’s a quick summary of how it works and what’s changed from Portrait mode on the Pixel 2.

The Pixel 3 still has a single camera setup, like before, and continues to use the dual-pixels on the sensor to estimate a stereo depth map of most objects so it can separate the subject from the background, which gives you that bokeh or depth effect. However, this time, Google is also using machine learning on the rear camera for a more accurate segmentation of people from the background. “It’s a computational neural network that estimates the probability of a person at each pixel in the image,” says Levoy. The selfie camera relies on this new learning-based technique too, as it lacks the dual-pixel autofocus system.
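To make the idea of a per-pixel person probability more concrete, here is a minimal sketch in Python. It is not Google’s code: the function name `person_mask`, the 0.5 threshold and the toy numbers are all assumptions for illustration. It only shows how a grid of probabilities could be turned into a simple person/background mask.

```python
def person_mask(prob_map, threshold=0.5):
    """Return a binary mask: 1 where the person probability meets the threshold.

    prob_map is a list of rows, each value being a hypothetical network
    output between 0.0 (background) and 1.0 (person).
    """
    return [[1 if p >= threshold else 0 for p in row] for row in prob_map]


# Toy 3x4 "probability map" (made-up values, not real network output).
probs = [
    [0.02, 0.10, 0.08, 0.01],
    [0.05, 0.91, 0.87, 0.04],
    [0.03, 0.95, 0.99, 0.06],
]

mask = person_mask(probs)
for row in mask:
    print(row)
```

A real pipeline would most likely keep the soft probabilities near the edges rather than a hard cut-off, so that hair and other fine detail blend smoothly into the blurred background.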

Combining this learning-based neural network with the dual-pixel data allows the new algorithm to deliver a more realistic blur. This means the blur on objects behind the human subject varies depending on their distance from the subject. Levoy further states that even though the final result might look pleasing, it’s not an accurate representation of what a DSLR with a wide aperture lens might capture. The algorithm deliberately keeps a “zone of depth” around the person, so that things like the person’s hands, hair or other elements on the person are also in sharp focus, which “makes it easier for novices to take pictures.”
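The depth-dependent blur and the “zone of depth” can be pictured with a small Python sketch. Everything here is an assumption for illustration, including the function name `blur_radius`, the linear ramp, and the parameter values; Google has not published the exact formula in this form.

```python
def blur_radius(depth_m, subject_depth_m, zone_m=0.3, strength=4.0, max_radius=12.0):
    """Return a blur radius in pixels for a pixel at depth_m (metres).

    Pixels within zone_m of the subject stay perfectly sharp. Beyond that
    zone, blur grows linearly with distance and is capped at max_radius.
    All parameter values are hypothetical.
    """
    distance = abs(depth_m - subject_depth_m)
    if distance <= zone_m:
        return 0.0  # inside the sharp "zone of depth"
    return min(max_radius, strength * (distance - zone_m))


subject = 2.0  # subject standing 2 m from the camera
for depth in (2.0, 2.2, 3.0, 5.0, 10.0):
    print(f"{depth:>4} m -> blur radius {blur_radius(depth, subject):.1f} px")
```

The flat region around the subject is what separates this from a real lens, where only a thin plane is perfectly sharp and blur starts growing immediately on either side of it.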

The thing about a neural network is that while it’s very efficient once it’s up and running, it needs to be trained, which means feeding it hundreds of thousands of data sets first. To achieve this, Google built a specialised rig, or a ‘Frankenphone’ as it calls it, consisting of five Pixel 3 phones which individually captured the same shot but from slightly varying perspectives. This let Google capture high-quality depth maps for every photo taken, which were used to train the neural network.
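Why do slightly different viewpoints help with depth? The basic geometric idea is triangulation: the closer an object is, the more it appears to shift between two cameras. The Python sketch below shows the standard two-view relation, depth = focal length × baseline / disparity. It is a simplified illustration, not a description of Google’s actual five-phone pipeline, and the function name and numbers are assumptions.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Triangulate depth (metres) from the pixel shift between two views.

    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centres, in metres
    disparity_px: horizontal shift of the same point between the views
    Assumes two rectified, side-by-side cameras.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical numbers: 1000 px focal length, cameras 5 cm apart.
for disparity in (50.0, 25.0, 10.0):
    depth = depth_from_disparity(1000.0, 0.05, disparity)
    print(f"disparity {disparity:>4} px -> depth {depth:.1f} m")
```

More cameras, spaced in more than one direction, give more such measurements per point, which is one reason a multi-phone rig can produce cleaner depth maps than a single pair of views.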

Some of the other stand-out features in the Pixel 3 phones are Night Sight and Super Res Zoom. Night Sight uses the multi-frame exposure technique, which is the basis for HDR+, to give you cleaner and brighter night-time photos. We’ve looked at this feature in great detail and also tested how it works on all three generations of Pixel smartphones. Super Res Zoom is a feature exclusive to the Pixel 3 and Pixel 3 XL, which improves the quality of digitally zoomed images.
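The reason multi-frame capture gives cleaner photos is that random sensor noise tends to average out when several shots of the same scene are combined. The Python sketch below illustrates that principle only. It assumes perfectly aligned frames and simple Gaussian noise, and the helper names `merge_frames` and `noisy_frame` are made up; the real HDR+ and Night Sight pipelines also have to align frames and cope with motion.

```python
import random
import statistics


def merge_frames(frames):
    """Average already-aligned frames pixel by pixel."""
    n = len(frames)
    return [sum(pixels) / n for pixels in zip(*frames)]


def noisy_frame(scene, sigma, rng):
    """Simulate one short exposure: the true scene plus Gaussian noise."""
    return [value + rng.gauss(0.0, sigma) for value in scene]


rng = random.Random(0)   # fixed seed so runs are repeatable
scene = [10.0] * 2000    # a dim, flat scene stored as a 1-D list of pixels
frames = [noisy_frame(scene, 5.0, rng) for _ in range(16)]

single_noise = statistics.pstdev(frames[0])
merged_noise = statistics.pstdev(merge_frames(frames))
print(f"noise in one frame:     {single_noise:.2f}")
print(f"noise after 16 frames:  {merged_noise:.2f}")
```

In this idealised setup, noise falls roughly with the square root of the number of frames, so 16 frames should cut it to about a quarter of the single-frame level.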