Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency

Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

75 Scopus citations

Abstract

We present an end-to-end joint training framework that explicitly models 6-DoF motion of multiple dynamic objects, ego-motion, and depth in a monocular camera setup without supervision. Our technical contributions are three-fold. First, we highlight the fundamental difference between inverse and forward projection while modeling the individual motion of each rigid object, and propose a geometrically correct projection pipeline using a neural forward projection module. Second, we design a unified instance-aware photometric and geometric consistency loss that holistically imposes self-supervisory signals for every background and object region. Lastly, we introduce a general-purpose auto-annotation scheme using any off-the-shelf instance segmentation and optical flow models to produce video instance segmentation maps that will be utilized as input to our training pipeline. These proposed elements are validated in a detailed ablation study. Through extensive experiments conducted on the KITTI and Cityscapes dataset, our framework is shown to outperform the state-of-the-art depth and motion estimation methods. Our code, dataset, and models are publicly available.

Original languageEnglish
Title of host publication35th AAAI Conference on Artificial Intelligence, AAAI 2021
PublisherAssociation for the Advancement of Artificial Intelligence
Pages1863-1872
Number of pages10
ISBN (Electronic)9781713835974
DOIs
StatePublished - 2021
Event35th AAAI Conference on Artificial Intelligence, AAAI 2021 - Virtual, Online
Duration: 2 Feb 20219 Feb 2021

Publication series

Name35th AAAI Conference on Artificial Intelligence, AAAI 2021
Volume3A

Conference

Conference35th AAAI Conference on Artificial Intelligence, AAAI 2021
CityVirtual, Online
Period2/02/219/02/21

Bibliographical note

Publisher Copyright:
Copyright © 2021, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved

Fingerprint

Dive into the research topics of 'Learning Monocular Depth in Dynamic Scenes via Instance-Aware Projection Consistency'. Together they form a unique fingerprint.

Cite this