Abstract:
For vision-based robotic manipulation, real-time and accurate pose estimation of target objects has long been challenging under cluttered backgrounds, illumination variations, occlusion, and weak texture, especially under severe occlusion. In recent years, RGB-based methods built on vector-field prediction have proved robust for 6D object pose estimation under occlusion. Meanwhile, networks with attention mechanisms have achieved outstanding performance in 2D object detection. In this paper, we propose an attention-driven 6D pose estimation method with a multi-constraint loss and pixel-wise voting. We compute distance-weighted unit-vector length and included-angle terms from the predictions to regularize the unit-vector prediction. Moreover, we introduce Dense Atrous Spatial Pyramid Pooling (DenseASPP) and Channel-wise Cross Attention (CCA) mechanisms into the network structure to improve prediction accuracy. Experiments on the LINEMOD and Occlusion LINEMOD datasets show that our method outperforms state-of-the-art two-stage sparse 2D keypoint prediction methods without pose refinement. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
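The pixel-wise voting the abstract refers to follows the general PVNet-style scheme: each pixel on the object predicts a unit vector pointing toward a 2D keypoint, and a RANSAC-like procedure intersects random vector pairs to hypothesize the keypoint, scoring each hypothesis by how many pixels agree. The sketch below is a minimal illustration of that generic voting step only, not the paper's exact implementation; the function names, the hypothesis count, and the cosine inlier threshold are all assumptions.

```python
import numpy as np

def intersect_rays(p1, d1, p2, d2):
    """Intersect two 2D lines p1 + t1*d1 and p2 + t2*d2.
    Returns the intersection point, or None if (near-)parallel."""
    A = np.array([[d1[0], -d2[0]],
                  [d1[1], -d2[1]]])
    if abs(np.linalg.det(A)) < 1e-8:
        return None
    t = np.linalg.solve(A, p2 - p1)
    return p1 + t[0] * d1

def vote_keypoint(pixels, vectors, n_hyp=100, cos_thresh=0.99, seed=None):
    """RANSAC-style pixel-wise voting (illustrative sketch).

    pixels:  (N, 2) pixel coordinates on the object mask.
    vectors: (N, 2) predicted unit vectors, each pointing at the keypoint.
    Samples n_hyp random pixel pairs, intersects their vectors to get
    keypoint hypotheses, and keeps the hypothesis with the most inliers
    (pixels whose vector agrees with the hypothesis direction)."""
    rng = np.random.default_rng(seed)
    n = len(pixels)
    best, best_score = None, -1
    for _ in range(n_hyp):
        i, j = rng.choice(n, size=2, replace=False)
        h = intersect_rays(pixels[i], vectors[i], pixels[j], vectors[j])
        if h is None:
            continue
        diff = h - pixels                        # (N, 2) pixel -> hypothesis
        norm = np.linalg.norm(diff, axis=1)
        valid = norm > 1e-6                      # skip pixels at the hypothesis
        cos = np.einsum('ij,ij->i',
                        diff[valid] / norm[valid, None],
                        vectors[valid])          # agreement per pixel
        score = int(np.sum(cos > cos_thresh))
        if score > best_score:
            best, best_score = h, score
    return best, best_score
```

With noise-free synthetic vectors every non-degenerate pair reconstructs the keypoint exactly, so the inlier count approaches the number of mask pixels; with noisy network output, the best-scoring hypothesis (or an inlier-weighted mean) is the robustness mechanism that tolerates occluded pixels.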
ISSN: 0302-9743
Year: 2022
Volume: 13719 LNCS
Page: 35-47
Language: English
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0