Abstract:
Geometric and semantic contexts are essential to solving the ill-posed problem of monocular depth estimation (MDE). In this paper, we propose a deep MDE framework that aggregates dual-modal structural contexts for monocular depth estimation (DSC-MDE). First, a cross-shaped context (CSC) aggregation module is developed to globally encode the geometric structures in depth maps observed within the fields of view of robots and autonomous vehicles. Next, the CSC-encoded geometric features are further modulated with semantic context in an object-regional context (ORC) aggregation module. Finally, to train the proposed network, we present a focal ordinal loss (FOL) that pays more attention to distant samples, avoiding the over-relaxed constraints that the ordinal regression loss (ORL) places on such samples. We compare the proposed model to recent methods with geometric and multi-modal contexts, and show that it achieves state-of-the-art performance on both indoor and outdoor datasets, including NYU-Depth-V2, Cityscapes, and KITTI. (c) 2023 Elsevier B.V. All rights reserved.
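For intuition on the focal ordinal loss described in the abstract, below is a minimal PyTorch sketch of a focal weighting applied to a DORN-style ordinal regression loss. It assumes depth is discretized into K ordinal bins and the network predicts, per pixel, P(depth > threshold_k); the function name focal_ordinal_loss, the (1 - p)^gamma modulation, and the default gamma are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of a focal ordinal loss (FOL) for monocular depth estimation.
# Assumption: DORN-style ordinal targets; the focal term down-weights
# easy thresholds so hard (typically distant) samples get more gradient.
import torch


def focal_ordinal_loss(probs: torch.Tensor,
                       gt_bins: torch.Tensor,
                       gamma: float = 2.0,
                       eps: float = 1e-6) -> torch.Tensor:
    """probs:   (B, K, H, W) per-pixel probabilities P(depth > threshold_k).
    gt_bins: (B, H, W) ground-truth ordinal bin index in [0, K)."""
    B, K, H, W = probs.shape
    k_range = torch.arange(K, device=probs.device).view(1, K, 1, 1)
    # Ordinal targets: 1 where the true depth exceeds threshold_k, else 0.
    targets = (gt_bins.unsqueeze(1) > k_range).float()
    probs = probs.clamp(eps, 1.0 - eps)
    # Standard ordinal regression cross-entropy, per threshold and pixel.
    ce = -(targets * probs.log() + (1.0 - targets) * (1.0 - probs).log())
    # Focal modulation: probability assigned to the correct side of each
    # threshold; well-classified thresholds are down-weighted.
    p_correct = targets * probs + (1.0 - targets) * (1.0 - probs)
    loss = ((1.0 - p_correct) ** gamma * ce).sum(dim=1)  # sum over K bins
    return loss.mean()


if __name__ == "__main__":
    probs = torch.rand(2, 64, 8, 8)        # dummy per-bin predictions
    gt = torch.randint(0, 64, (2, 8, 8))   # dummy ground-truth bin indices
    print(focal_ordinal_loss(probs, gt))
```

Setting gamma = 0 recovers the plain ordinal regression loss; increasing gamma shifts emphasis toward thresholds the network still gets wrong, which in practice concentrates training signal on distant, sparsely supervised depth samples.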
Source: KNOWLEDGE-BASED SYSTEMS
ISSN: 0950-7051
Year: 2023
Volume: 263
Impact Factor: 8.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE
ESI HC Threshold: 19
Cited Count:
WoS CC Cited Count: 3
SCOPUS Cited Count: 4
ESI Highly Cited Papers on the List: 0