End-to-end High Dynamic Range Camera Pipeline Optimization (CVPR 2021)
With a 280 dB dynamic range, the real world is a High Dynamic Range (HDR) world. Today’s sensors cannot record this dynamic range in a single shot. Instead, HDR cameras acquire multiple measurements with different exposures, gains and photodiodes, from which an Image Signal Processor (ISP) reconstructs an HDR image. HDR image recovery for dynamic scenes is an open challenge because of motion and because stitched captures have different noise characteristics, resulting in artefacts that the ISP has to resolve – in real time and at triple-digit megapixel resolutions. Traditionally, hardware ISP settings used by downstream vision modules have been chosen by domain experts. Such frozen camera designs are then used for training data acquisition and supervised learning of downstream vision modules. We depart from this paradigm and formulate HDR ISP hyperparameter search as an end-to-end optimization problem. We propose a mixed 0th and 1st-order block coordinate descent optimizer to jointly learn ISP and detector network weights using RAW image data augmented with emulated SNR transition region artefacts. We assess the proposed method for human vision and image understanding. For automotive object detection, the method improves mAP and mAR by 33% compared to expert-tuning and by 22% compared to recent state-of-the-art. The method is validated in an HDR laboratory rig and in the field, outperforming conventional handcrafted HDR imaging and vision pipelines in all experiments.