Abstract:
Group activity recognition (GAR) remains challenging due to the diverse interactions among individuals across different group activities. Therefore, fully exploring the complex spatiotemporal interactions among multiple individuals in video scenes is key to GAR. To address this issue, we propose a novel GAR framework, which hierarchically fuses features at different levels from complementary spatiotemporal dual paths, enhancing the spatiotemporal interactions between individuals. Moreover, different from previous works, our framework further incorporates a unique contrastive loss function, which upholds consistency of the same individuals’ representations across paths at the instance level and amplifies distinction among features of individuals from distinct classes at the category level. This unique loss can help reduce noise interference in the representations of the dual path and avoid confusion between individual categories. Our method has been extensively evaluated on public datasets, demonstrating its superiority. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
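The two-level contrastive idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the InfoNCE-style instance term, the margin-based category term, the averaging of the two paths, and every name and parameter below (`instance_consistency_loss`, `category_separation_loss`, `temperature`, `margin`, `weight`) are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def _l2_normalize(x, eps=1e-12):
    # Scale each row to (approximately) unit length so dot products are cosine similarities.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def instance_consistency_loss(path_a, path_b, temperature=0.1):
    """Assumed InfoNCE-style term: row i of path_a should match row i of path_b
    (the same individual seen by the other path) better than any other row."""
    a = _l2_normalize(path_a)
    b = _l2_normalize(path_b)
    logits = a @ b.T / temperature                       # (N, N) cross-path similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


def category_separation_loss(feats, labels, margin=0.0):
    """Assumed margin term: penalize cosine similarity above `margin`
    between individuals whose class labels differ."""
    f = _l2_normalize(feats)
    sim = f @ f.T
    diff_class = labels[:, None] != labels[None, :]
    if not diff_class.any():
        return 0.0
    return float(np.mean(np.maximum(sim[diff_class] - margin, 0.0)))


def dual_path_contrastive_loss(path_a, path_b, labels,
                               temperature=0.1, margin=0.0, weight=1.0):
    # Hypothetical combination: instance-level term plus a weighted category-level
    # term computed on the average of the two paths' features.
    inst = instance_consistency_loss(path_a, path_b, temperature)
    fused = 0.5 * (path_a + path_b)
    cat = category_separation_loss(fused, labels, margin)
    return inst + weight * cat
```

In this sketch, `path_a` and `path_b` stand for per-individual feature matrices of shape (N, D) produced by the two spatiotemporal paths, and `labels` is a length-N NumPy array of individual action classes. The instance term is low when each individual's two representations agree, and the category term is low when individuals of different classes have dissimilar features.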
ISSN: 1865-0929
Year: 2025
Volume: 2302 CCIS
Page: 247-261
Language: English