TY - JOUR
T1 - A Large-Scale Virtual Dataset and Egocentric Localization for Disaster Responses
AU - Jeon, Hae Gon
AU - Im, Sunghoon
AU - Lee, Byeong Uk
AU - Rameau, Francois
AU - Choi, Dong Geol
AU - Oh, Jean
AU - Kweon, In So
AU - Hebert, Martial
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2023/6/1
Y1 - 2023/6/1
AB - With the increasing societal demand for disaster response, visual observation methods for rescue and safety have become increasingly important. However, because of the shortage of datasets for disaster scenarios, progress in computer vision and robotics in this field has been limited. With this in mind, we present the first large-scale synthetic dataset of egocentric viewpoints for disaster scenarios. We simulate pre- and post-disaster cases with drastic appearance changes, such as buildings on fire and earthquake damage. The dataset consists of more than 300K high-resolution stereo image pairs, all annotated with ground-truth data for semantic labels, metric-scale depth, optical flow with sub-pixel precision, and surface normals, as well as the corresponding camera poses. To create realistic disaster scenes, we manually augment the effects with 3D models using physically-based graphics tools. We train various state-of-the-art methods on our dataset to perform computer vision tasks, evaluate how well these methods recognize disaster situations, and show that they produce reliable results on virtual scenes as well as real-world images. We also present a convolutional neural network-based egocentric localization method that is robust to drastic appearance changes, such as texture changes in a fire, and to layout changes from a collapse. To address these key challenges, we propose a new model that learns a shape-based representation by training on stylized images and incorporates the dominant planes of query images as approximate scene coordinates. We evaluate the proposed method on various scenes, including a simulated disaster dataset, to demonstrate its effectiveness when confronted with significant changes in scene layout. Experimental results show that our method provides reliable camera pose predictions despite vastly changed conditions.
KW - Large-scale dataset
KW - camera relocalization
KW - disaster scenarios
KW - egocentric localization
KW - visual odometry
UR - http://www.scopus.com/inward/record.url?scp=85110925764&partnerID=8YFLogxK
U2 - 10.1109/TPAMI.2021.3094531
DO - 10.1109/TPAMI.2021.3094531
M3 - Article
AN - SCOPUS:85110925764
SN - 0162-8828
VL - 45
SP - 6766
EP - 6782
JO - IEEE Transactions on Pattern Analysis and Machine Intelligence
JF - IEEE Transactions on Pattern Analysis and Machine Intelligence
IS - 6
ER -