
Author:

Wu, Y. | Kong, D. | Gao, J. | Li, J. | Yin, B.

Indexed by:

EI; Scopus; SCIE

Abstract:

Unlike image-based 3D pose estimation, video-based 3D pose estimation improves performance by exploiting temporal information. However, these methods still generalize poorly across human motion speed, body shape, and camera distance. To address these problems, we propose a novel approach, joint Spatial–temporal Multi-scale Transformers and Pose Transformation Equivalence Constraints (SMT-PTEC), for 3D human pose estimation from videos. We design a more general spatial–temporal multi-scale feature extraction strategy and introduce optimization constraints that adapt to the diversity of the data to improve the accuracy of pose estimation. Specifically, we first introduce a spatial multi-scale transformer that extracts multi-scale pose features and establishes a cross-scale information transfer mechanism, effectively exploiting the underlying knowledge of human motion. We then present a temporal multi-scale transformer that models multi-scale dependencies between frames, enhances the network's adaptability to different motion speeds, and improves estimation accuracy through a context-aware fusion of multi-scale predictions. Moreover, we add pose transformation equivalence constraints by transforming the training samples with horizontal flipping, scaling, and body shape transformations, which effectively reduces the influence of camera distance and body shape on prediction accuracy. Extensive experimental results demonstrate that our approach achieves superior performance with lower computational complexity than previous state-of-the-art methods. Code is available at https://github.com/JNGao123/SMT-PTEC. © 2024 Elsevier Inc.
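
To illustrate the pose transformation equivalence idea described in the abstract, below is a minimal PyTorch sketch of the horizontal-flip case only: predictions on a mirrored input sequence should match the mirrored predictions. This is not the authors' released code (see the GitHub link above for that); the model interface, tensor shapes, and joint index lists are hypothetical assumptions for a 17-joint skeleton.

import torch

# Hypothetical left/right joint index lists; the real skeleton layout
# depends on the dataset (e.g., Human3.6M) and is an assumption here.
LEFT  = [4, 5, 6, 11, 12, 13]
RIGHT = [1, 2, 3, 14, 15, 16]

def flip_pose(pose):
    # Mirror a pose about the x-axis and swap left/right joints.
    # pose: (..., J, C) tensor with the x coordinate in channel 0.
    flipped = pose.clone()
    flipped[..., 0] = -flipped[..., 0]
    flipped[..., LEFT + RIGHT, :] = flipped[..., RIGHT + LEFT, :]
    return flipped

def flip_equivalence_loss(model, keypoints_2d):
    # Equivalence constraint: predicting on a flipped input sequence
    # should give the same result as flipping the prediction.
    # keypoints_2d: (B, T, J, 2); model is assumed to return (B, T, J, 3).
    pred = model(keypoints_2d)
    pred_from_flipped = model(flip_pose(keypoints_2d))
    return (pred_from_flipped - flip_pose(pred)).abs().mean()

The scaling and body-shape constraints mentioned in the abstract would follow the same pattern: apply the transformation to the input, and penalize any mismatch between the prediction on the transformed input and the correspondingly transformed prediction.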

Keyword:

Transformer; Spatial–temporal multi-scale; Pose transformation equivalence; Pose estimation

Author Community:

  • [ 1 ] [Wu Y.]Beijing Institute of Artificial Intelligence, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 2 ] [Kong D.]Beijing Institute of Artificial Intelligence, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 3 ] [Gao J.]Beijing Institute of Artificial Intelligence, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 4 ] [Li J.]Beijing Institute of Artificial Intelligence, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
  • [ 5 ] [Yin B.]Beijing Institute of Artificial Intelligence, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China

Source:

Journal of Visual Communication and Image Representation

ISSN: 1047-3203

Year: 2024

Volume: 103

Impact Factor: 2.600 (JCR@2022)

ESI Highly Cited Papers on the List: 0
