Indexed by:
Abstract:
Human gesture recognition is an important research field in human-computer interaction because of its potential applications in many areas, but existing methods still struggle to achieve high accuracy. To address this issue, some existing studies fuse global features with cropped features, called focuses, taken from vital body parts such as the hands. However, most methods choose the focuses based on experience, and the focus selection scheme is not discussed in detail. In this paper, a hierarchical body part combination method is proposed that takes into account the number, combinations, and logical relationships of body parts. The proposed approach uses this method to generate multiple focuses and employs a chart-based surface modality alongside red-green-blue and optical flow modalities to enhance each focus. A feature-level fusion scheme based on the residual connection structure is proposed to fuse different modalities at the convolution stages, and a focus fusion scheme is proposed to learn the relevancy of focus channels for each gesture class individually. Experiments conducted on the ChaLearn isolated gesture dataset show that using multiple focuses together with multi-modal features and fusion strategies improves gesture recognition accuracy.
Keyword:
Reprint Author's Address:
Source:
TSINGHUA SCIENCE AND TECHNOLOGY
ISSN: 1007-0214
Year: 2025
Issue: 4
Volume: 30
Page: 1583-1599
6.600
JCR@2022
Cited Count:
SCOPUS Cited Count:
ESI Highly Cited Papers on the List: 0
WanFang Cited Count:
Chinese Cited Count:
Affiliated Colleges: