
Author:

Liu, Caixia | Kong, Dehui | Wang, Shaofan | Li, Jinghua | Yin, Baocai

Indexed by:

EI; Scopus; SCIE

Abstract:

Although deep-network-based methods outperform traditional 3D reconstruction methods, which require multiocular images or class labels to recover the full 3D geometry, they may produce incomplete recovery and unfaithful reconstruction when facing occluded parts of 3D objects. To address these issues, we propose the Depth-preserving Latent Generative Adversarial Network (DLGAN), which consists of a 3D Encoder-Decoder based GAN (EDGAN, serving as generator and discriminator) and an Extreme Learning Machine (ELM, serving as classifier), for 3D reconstruction from a monocular depth image of an object. First, EDGAN decodes a latent vector from the 2.5D voxel grid representation of the input image and generates an initial 3D occupancy grid under common GAN losses, a latent vector loss, and a depth loss. For the latent vector loss, we design a 3D deep AutoEncoder (AE) to learn a target latent vector from the ground-truth 3D voxel grid, and use that vector to penalize the latent vector encoded from the input 2.5D data. For the depth loss, we use the input 2.5D data to penalize the initial 3D voxel grid from 2.5D views. Afterwards, ELM transforms the float values of the initial 3D voxel grid into binary values under a binary reconstruction loss. Experimental results show that DLGAN not only outperforms several state-of-the-art methods by a large margin on both a synthetic dataset and a real-world dataset, but also predicts occluded parts of 3D objects more accurately without class labels.
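The abstract describes two auxiliary losses added to the usual GAN objective: a latent vector loss (penalizing the code encoded from the 2.5D input against a target code learned by a 3D AE from the ground-truth voxel grid) and a depth loss (penalizing the predicted 3D grid against the input 2.5D data from 2.5D views). The PyTorch sketch below is illustrative only and is not the authors' implementation: the tensor shapes, the max-projection used to approximate a 2.5D view, and the weights lambda1/lambda2 are all assumptions introduced here.

    # Minimal sketch (not the paper's code) of the two auxiliary losses
    # named in the abstract. Assumptions: voxel grids are (B, D, H, W)
    # float tensors in [0, 1] with D the depth/viewing axis; latent
    # vectors are (B, 256).
    import torch
    import torch.nn.functional as F

    def latent_vector_loss(z_from_depth, z_from_gt):
        # L2 penalty pulling the latent code encoded from the 2.5D input
        # toward the target code the 3D AE learned from the ground truth.
        return F.mse_loss(z_from_depth, z_from_gt)

    def depth_loss(pred_voxels, input_voxels_25d):
        # Penalize the predicted occupancy grid against the voxelized
        # 2.5D input. The "2.5D view" is approximated here by a
        # max-projection along the depth axis; the paper's exact
        # projection may differ.
        pred_sil = pred_voxels.max(dim=1).values
        input_sil = input_voxels_25d.max(dim=1).values
        return F.binary_cross_entropy(pred_sil, input_sil)

    def generator_loss(adv_loss, z_from_depth, z_from_gt,
                       pred_voxels, input_voxels_25d,
                       lambda1=1.0, lambda2=1.0):
        # Hypothetical total generator objective: adversarial term plus
        # the two auxiliary terms, with placeholder weights.
        return (adv_loss
                + lambda1 * latent_vector_loss(z_from_depth, z_from_gt)
                + lambda2 * depth_loss(pred_voxels, input_voxels_25d))

    # Usage with random tensors (batch of 2, 64^3 grids):
    pred = torch.rand(2, 64, 64, 64)                    # sigmoid outputs
    inp = (torch.rand(2, 64, 64, 64) > 0.5).float()     # binarized 2.5D input
    z_pred, z_tgt = torch.randn(2, 256), torch.randn(2, 256)
    total = generator_loss(torch.tensor(0.0), z_pred, z_tgt, pred, inp)

The final ELM stage of the pipeline (thresholding the float-valued grid into a binary occupancy grid under a binary reconstruction loss) is omitted here, since the abstract does not specify its formulation.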

Keyword:

depth loss; monocular depth image; Three-dimensional displays; 3D reconstruction; Transforms; ELM; Image reconstruction; Shape; Generative adversarial networks; latent vector; Two-dimensional displays; Gallium nitride

Author Community:

  • [ 1 ] [Liu, Caixia]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 2 ] [Kong, Dehui]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 3 ] [Wang, Shaofan]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 4 ] [Li, Jinghua]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China
  • [ 5 ] [Yin, Baocai]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Reprint Author's Address:

  • [Wang, Shaofan]Beijing Univ Technol, Fac Informat Technol, Beijing Artificial Intelligence Inst, Beijing Key Lab Multimedia & Intelligent Software, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN: 1520-9210

Year: 2021

Volume: 23

Page: 2843-2856

Impact Factor: 7.300 (JCR@2022)

ESI Discipline: COMPUTER SCIENCE

ESI HC Threshold: 87

JCR Journal Grade: 1

Cited Count:

WoS CC Cited Count: 11

SCOPUS Cited Count: 14

ESI Highly Cited Papers on the List: 0
