Abstract:
Depth maps are used in many vision tasks thanks to the real-time acquisition and low cost of consumer depth cameras. However, they still suffer from low precision and severe sensor noise, despite significant research on depth enhancement. We propose MFFNet, a novel multi-level feature fusion convolutional neural network (CNN) for facial depth map refinement. It is a multi-stage network in which each stage is a local multi-level feature fusion (LMLF) block. To smooth noise while enhancing detailed facial structure, a hierarchical fusion strategy fully fuses multi-level features: each LMLF block fuses multi-level features locally within its stage, while inter-stage skip connections achieve global multi-level feature fusion. These skip connections also ease training by shortening information propagation paths. We further introduce an effective data augmentation method that synthesizes noisy facial depth maps in various poses; training with these synthetic data improves the method's robustness to face pose. The proposed method is evaluated on a synthetic facial depth map dataset, a real Kinect V2 facial depth map dataset, and the Middlebury Stereo Dataset. Experimental results show that our method produces high-quality refined depth maps and outperforms several state-of-the-art methods.
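The staged data flow described in the abstract (per-stage LMLF blocks whose outputs are carried forward by inter-stage skip connections into a global fusion) can be sketched schematically. The following is a minimal, hedged illustration of that data flow only, not the authors' implementation: the block internals, stage count, and the averaging fusion operator are all illustrative assumptions (real LMLF blocks are learned convolutional modules).

```python
import numpy as np

def lmlf_block(x, seed):
    # Stand-in for a local multi-level feature fusion (LMLF) block.
    # Here it is just a fixed random linear map with a nonlinearity;
    # the actual blocks in the paper are convolutional.
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[-1], x.shape[-1])) * 0.1
    return np.tanh(x @ w)

def mffnet_refine(depth_features, num_stages=3):
    """Schematic MFFNet-style data flow: each stage is an LMLF block;
    inter-stage skip connections keep every stage's output so a final
    global multi-level fusion (here a simple average) can combine them."""
    stage_outputs = []
    x = depth_features
    for s in range(num_stages):
        x = lmlf_block(x, seed=s)
        stage_outputs.append(x)  # carried forward via a skip connection
    # Global multi-level feature fusion over all stage outputs.
    return np.mean(stage_outputs, axis=0)

feats = np.ones((4, 8))  # toy stand-in for facial depth features
out = mffnet_refine(feats)
print(out.shape)  # (4, 8): fused output keeps the input feature shape
```

The point of the sketch is that the skip connections expose every stage's intermediate features to the final fusion step, which is how the network combines multi-level information globally rather than relying only on the last stage's output.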
Source:
SIGNAL PROCESSING-IMAGE COMMUNICATION
ISSN: 0923-5965
Year: 2022
Volume: 103
Impact Factor: 3.500 (JCR@2022)
ESI Discipline: ENGINEERING
ESI HC Threshold: 49
JCR Journal Grade:2
CAS Journal Grade:3
Cited Count:
WoS CC Cited Count: 5
SCOPUS Cited Count: 7
ESI Highly Cited Papers on the List: 0