DPSNet: End-to-end deep plane sweep stereo

Research output: Contribution to conferencePaperpeer-review

107 Scopus citations

Abstract

Multiview stereo aims to reconstruct scene depth from images acquired by a camera under arbitrary motion. Recent methods address this problem through deep learning, which can utilize semantic cues to deal with challenges such as textureless and reflective regions. In this paper, we present a convolutional neural network called DPSNet (Deep Plane Sweep Network) whose design is inspired by best practices of traditional geometry-based approaches for dense depth reconstruction. Rather than directly estimating depth and/or optical flow correspondence from image pairs as done in many previous deep learning methods, DPSNet takes a plane sweep approach that involves building a cost volume from deep features using the plane sweep algorithm, regularizing the cost volume via a context-aware cost aggregation, and regressing the dense depth map from the cost volume. The cost volume is constructed using a differentiable warping process that allows for end-to-end training of the network. Through the effective incorporation of conventional multiview stereo concepts within a deep learning framework, DPSNet achieves state-of-the-art reconstruction results on a variety of challenging datasets.

Original languageEnglish
StatePublished - 2019
Event7th International Conference on Learning Representations, ICLR 2019 - New Orleans, United States
Duration: 6 May 20199 May 2019

Conference

Conference7th International Conference on Learning Representations, ICLR 2019
Country/TerritoryUnited States
CityNew Orleans
Period6/05/199/05/19

Bibliographical note

Publisher Copyright:
© 7th International Conference on Learning Representations, ICLR 2019. All Rights Reserved.

Fingerprint

Dive into the research topics of 'DPSNet: End-to-end deep plane sweep stereo'. Together they form a unique fingerprint.

Cite this