Abstract:
With the construction of large-scale video datasets and the rapid development of machine vision technology, action recognition in videos has become a hot topic in many applications. The appearance and tempo of human actions vary in both spatial and temporal space, so it is necessary to focus on detailed descriptions of the fast and slow variations within an action category. We embed a spatial–temporal attention module into the SlowFast networks to strengthen their ability to describe fast and slow motion changes. The accuracy of the proposed method is effectively improved on the UCF-101 and HMDB-51 datasets. Experiments validate the effectiveness of the networks with the embedded spatial–temporal attention module in discriminating varying motion tempos. The proposed method captures a more detailed description of action categories from the slow and fast pathways and presents a more semantic recognition result. © 2023, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
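Illustrative sketch (not the authors' implementation): one way a spatial–temporal attention block could be applied to the feature maps of each SlowFast pathway. PyTorch is assumed, and the module design (a pooled temporal branch plus a 1x1x1 spatial branch), class name, channel counts, and tensor shapes are illustrative assumptions rather than details taken from the paper.

import torch
import torch.nn as nn

class SpatialTemporalAttention(nn.Module):
    # Re-weights 3D feature maps of shape (B, C, T, H, W) along the temporal and spatial axes.
    def __init__(self, channels):
        super().__init__()
        # Temporal branch: pool out the spatial dims, learn a per-frame, per-channel weight.
        self.temporal = nn.Sequential(
            nn.AdaptiveAvgPool3d((None, 1, 1)),            # -> (B, C, T, 1, 1)
            nn.Conv3d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: collapse channels, learn a per-location weight.
        self.spatial = nn.Sequential(
            nn.Conv3d(channels, 1, kernel_size=1),          # -> (B, 1, T, H, W)
            nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.temporal(x)    # emphasize informative frames (motion tempo)
        x = x * self.spatial(x)     # emphasize informative regions (appearance)
        return x

# Illustrative use on SlowFast-style pathway features (channel counts are made up):
slow_feat = torch.randn(2, 64, 4, 56, 56)    # slow pathway: few frames, many channels
fast_feat = torch.randn(2, 8, 32, 56, 56)    # fast pathway: many frames, few channels
slow_out = SpatialTemporalAttention(64)(slow_feat)
fast_out = SpatialTemporalAttention(8)(fast_feat)

The same block is reused on both pathways here; only the channel width changes, so the slow pathway's attention can focus on appearance detail while the fast pathway's attention can focus on rapid motion.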
ISSN: 2190-3018
Year: 2023
Volume: 341
Page: 169-179
Language: English
Cited Count:
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0