Abstract:
Although prior methods have achieved promising performance in recovering 3D geometry from a single depth image, they tend to produce incomplete and noisy 3D shapes. To address this, we propose the Multi-Scale Latent Feature-Aware Network (MLANet), which recovers the full 3D voxel grid from a single depth view of an object. MLANet logically partitions a 3D voxel grid into visible voxels, occluded voxels and non-object voxels, and aims to reconstruct the latter two. MLANet first introduces a Multi-Scale Latent Feature-Aware (MLFA) based AutoEncoder (MLFA-AE) and a logical partition module to predict an occluded voxel grid (OccVoxGd) and a non-object voxel grid (NonVoxGd) from the visible voxel grid (VisVoxGd) corresponding to the input. It then introduces an MLFA based Generative Adversarial Network (MLFA-GAN) to refine the OccVoxGd and the NonVoxGd, and combines them with the VisVoxGd to generate the target 3D occupancy grid. MLFA learns multi-scale features of an object effectively and can serve as a plug-and-play component to improve existing networks. The logical partition helps suppress NonVoxGd noise and improve OccVoxGd accuracy under adversarial constraints. Experiments on both synthetic and real-world data show that MLANet outperforms state-of-the-art methods and, in particular, reconstructs unseen object categories with higher accuracy. © 2023 Elsevier B.V.
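The three-way logical partition described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: in MLANet the occluded and non-object grids are predicted by networks, whereas the sketch below only shows the set relationship between the grids when the full shape is known. The function names `partition_voxels` and `combine`, and the sparse set-of-coordinates representation, are assumptions made for illustration.

```python
from itertools import product


def partition_voxels(full, visible, size):
    """Split a size^3 grid into visible, occluded and non-object voxel sets.

    full    -- set of (x, y, z) voxels occupied by the complete shape
    visible -- subset of `full` observed from the depth view
    (hypothetical helper, not taken from the paper)
    """
    if not visible <= full:
        raise ValueError("visible voxels must belong to the full shape")
    all_voxels = set(product(range(size), repeat=3))
    occluded = full - visible        # occupied, but hidden from the view
    non_object = all_voxels - full   # empty space around the object
    return visible, occluded, non_object


def combine(visible, occluded):
    """Recombine visible and occluded voxels into the full occupancy set."""
    return visible | occluded


# Toy example: a 3-voxel column in a 3x3x3 grid, seen end-on so that
# only the first voxel is visible.
full = {(1, 1, z) for z in range(3)}
visible = {(1, 1, 0)}
vis, occ, non = partition_voxels(full, visible, size=3)
```

In this toy case the occluded set holds the two hidden voxels of the column, and recombining `vis` and `occ` gives back the full shape.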
Source:
Neurocomputing
ISSN: 0925-2312
Year: 2023
Volume: 533
Page: 22-34
Impact factor: 6.000 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE;
ESI HC Threshold:19
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 6
ESI Highly Cited Papers on the List: 0
30 Days PV: 10