Learning Residual Flow as Dynamic Motion from Stereo Videos

Seokju Lee, Sunghoon Im, Stephen Lin, In So Kweon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Scopus citations

Abstract

We present a method for decomposing the 3D scene flow observed from a moving stereo rig into stationary scene elements and dynamic object motion. Our unsupervised learning framework jointly reasons about the camera motion, optical flow, and 3D motion of moving objects. Three cooperating networks predict stereo matching, camera motion, and residual flow, which represents the flow component due to object motion and not from camera motion. Based on rigid projective geometry, the estimated stereo depth is used to guide the camera motion estimation, and the depth and camera motion are used to guide the residual flow estimation. We also explicitly estimate the 3D scene flow of dynamic objects based on the residual flow and scene depth. Experiments on the KITTI dataset demonstrate the effectiveness of our approach and show that our method outperforms other state-of-the-art algorithms on the optical flow and visual odometry tasks.
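The decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (the paper uses three learned networks); it only shows the rigid projective geometry under assumed inputs: a pinhole intrinsics matrix K, a per-pixel depth map, and a camera motion (R, t) mapping frame-1 camera coordinates to frame-2. The function names rigid_flow and residual_flow are hypothetical. The rigid flow is what a fully static scene would produce under camera motion alone, and the residual flow is whatever remains of the observed optical flow after subtracting it.

```python
import numpy as np

def pixel_grid(h, w):
    """Homogeneous pixel coordinates (u, v, 1) for an h x w image, shape (3, h*w)."""
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    return np.stack([u.ravel(), v.ravel(), np.ones(h * w)])

def rigid_flow(depth, K, R, t):
    """Optical flow induced by camera motion alone, assuming a static scene.

    depth : (h, w) depth map in frame 1 (assumed input)
    K     : (3, 3) pinhole intrinsics (assumed camera model)
    R, t  : rotation (3, 3) and translation (3,) taking frame-1 camera
            coordinates to frame-2 camera coordinates (assumed convention)
    """
    h, w = depth.shape
    pix = pixel_grid(h, w)
    # Back-project every pixel to a 3D point using its depth.
    pts1 = (np.linalg.inv(K) @ pix) * depth.ravel()
    # Move the points into the second camera frame and re-project.
    pts2 = R @ pts1 + t.reshape(3, 1)
    proj = K @ pts2
    uv2 = proj[:2] / proj[2:3]
    # Flow is the displacement of each pixel, returned as (h, w, 2).
    return (uv2 - pix[:2]).T.reshape(h, w, 2)

def residual_flow(total_flow, depth, K, R, t):
    """Flow component not explained by camera motion (attributed to moving objects)."""
    return total_flow - rigid_flow(depth, K, R, t)
```

In this sketch, pixels on static structure have a residual near zero, while pixels on independently moving objects keep a non-zero residual that can then be lifted to 3D using the scene depth.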

Original language: English
Title of host publication: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1180-1186
Number of pages: 7
ISBN (Electronic): 9781728140049
State: Published - Nov 2019
Event: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019 - Macau, China
Duration: 3 Nov 2019 → 8 Nov 2019

Publication series

Name: IEEE International Conference on Intelligent Robots and Systems
ISSN (Print): 2153-0858
ISSN (Electronic): 2153-0866

Conference

Conference: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Country/Territory: China
City: Macau
Period: 3/11/19 → 8/11/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.
