Tobias Gruber, Frank Julca-Aguilar, Mario Bijelic, Werner Ritter, Klaus Dietmayer, Felix Heide
We present an imaging framework which converts three images from a gated camera into high-resolution depth maps with depth resolution comparable to pulsed lidar measurements. Existing scanning lidar systems achieve low spatial resolution at large ranges due to mechanically-limited angular sampling rates, restricting scene understanding tasks to close-range clusters with dense sampling. In addition, today’s lidar detector technologies, short-pulsed laser sources and scanning mechanics result in high cost, power consumption and large form-factors.
We depart from point scanning and propose a learned architecture that recovers high-fidelity dense depth from three temporally gated images, acquired with a flash source and a high-resolution CMOS sensor. The proposed architecture exploits semantic context across gated slices, and is trained on a synthetic discriminator loss without the need of dense depth labels.
The method is real-time and essentially turns a gated camera into a low-cost dense flash lidar which we validate on a wide range of outdoor driving captures and in simulations.