ASGSA: global semantic-aware network for action segmentation

Author：

Bian, Qingyun | Zhang, Chun | Ren, Keyan | Yue, Tianyi | Zhang, Yunlu

Indexed by：

EI Scopus

Abstract：

Action segmentation is vital for video understanding because it heuristically divides complex untrimmed videos into short semantic clips. Real-world human actions exhibit complex temporal dynamics, including variations in duration, rhythm, and range of motion. While deep networks have been successfully applied to this task, they struggle to adapt to these complex variations because of the inherent difficulty of capturing semantic information from a global perspective. Relying merely on distinguishing visual representations in local regions leads to over-segmentation. To address this practical issue, we propose a novel approach named ASGSA, which aims to obtain smoother segmentation results by extracting instructive semantic information. Our core component, the Global Semantic-Aware module, provides an effective way to encode long-range temporal relations in long untrimmed videos. Specifically, we exploit hierarchical temporal context aggregation, in which a gating mechanism controls the passage of information at different scales. In addition, an adaptive fusion strategy is designed to guide the segmentation with the extracted semantic information. Simultaneously, to obtain higher-quality video representations without extra annotations, we adopt a self-supervised training strategy and propose the Video Speed Prediction module. Extensive experiments demonstrate that our approach achieves state-of-the-art performance on all three challenging benchmark datasets (Breakfast, 50Salads, GTEA) and significantly improves the F1@50 score, which reflects a reduction in over-segmentation. The code is available at https://github.com/ten000/ASGSA. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.

Keyword：

Benchmarking | Semantic Segmentation | Semantic Web | Semantics | Complex networks

Author Community：

[ 1 ] [Bian, Qingyun] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 2 ] [Zhang, Chun] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 3 ] [Ren, Keyan] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 4 ] [Yue, Tianyi] Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 5 ] [Zhang, Yunlu] China Mobile Research Institute, Beijing, China

Source ：

Neural Computing and Applications

ISSN： 0941-0643

Year： 2024

Issue： 22

Volume： 36

Page： 13629-13645

JCR@2022： 6.000
