Abstract
Precise semantic segmentation remains a challenge for deep learning (DL) techniques because low image resolution blurs the boundaries of target objects. Although up/downsampling operations improved segmentation performance in early DL models, conventional operators cannot fully preserve spatial information and thus produce vague object boundaries. To segment target objects precisely across many domains, this paper presents two novel operators: (1) the upsampling interpolation method (USIM), an operator that upsamples input feature maps and combines them into one while preserving the spatial information of both inputs, and (2) the USIM gate (UG), an advanced USIM operator with boundary-attention mechanisms. We designed our experiments using aerial images, where boundaries critically influence the results. Furthermore, we verified on the Cityscapes dataset that our approach effectively segments target objects. The experimental results demonstrate that using the USIM and UG with state-of-the-art DL models improves segmentation performance and yields clear boundaries of target objects (Intersection over Union: +6.9%; Boundary Jaccard: +10.1%). In addition, mathematical proofs verify that the USIM and UG contribute to the handling of spatial information.
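For readers unfamiliar with the Intersection over Union metric cited above, the following is a minimal sketch of how it is commonly computed for a pair of binary masks. It is not taken from the paper: the function name `iou`, the nested-list mask representation, and the convention for two empty masks are assumptions made for illustration, and the paper's own evaluation protocol (for example, per-class averaging) may differ.

```python
def iou(pred, target):
    """Intersection over Union for two binary masks.

    Both masks are assumed to be equal-shape nested lists of 0/1 values
    (an illustrative representation, not the paper's data format).
    """
    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        for p, t in zip(row_pred, row_target):
            if p and t:
                intersection += 1  # pixel is foreground in both masks
            if p or t:
                union += 1         # pixel is foreground in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return intersection / union


# Hypothetical 2x3 masks: 2 overlapping pixels, 4 pixels in the union.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = iou(pred, target)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.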
Original language | English |
---|---|
Pages (from-to) | 535-562 |
Number of pages | 28 |
Journal | Proceedings of Machine Learning Research |
Volume | 206 |
State | Published - 2023 |
Event | 26th International Conference on Artificial Intelligence and Statistics, AISTATS 2023 - Valencia, Spain. Duration: 25 Apr 2023 → 27 Apr 2023 |
Bibliographical note
Publisher Copyright: Copyright © 2023 by the author(s)