Author:

Xi, Zeyu | Zhou, Xinlang | Liu, Zilin | Wu, Lifang

Indexed by:

EI Scopus

Abstract:

The temporal text localization (TTL) task aims to identify a segment within a long untrimmed video that semantically matches a given textual query. However, most methods require extensive manual annotation of temporal boundaries for each query, which restricts their scalability and practicality in real-world applications. Moreover, modeling temporal context is particularly crucial for the TTL task. In this paper, a Vision Token Rolling Transformer for weakly supervised temporal text localization (VTR-former) is developed. VTR-former does not rely on predefined temporal boundaries during training or testing. It significantly improves temporal information capture and feature representation by rolling vision tokens and employing advanced transformer-based feature learning modules. Experiments on two challenging benchmarks, Charades-STA and ActivityNet Captions, demonstrate that VTR-former outperforms the baseline network and achieves leading performance. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
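The abstract does not specify how the vision-token rolling is implemented; a minimal sketch of one plausible reading — cyclically shifting vision tokens along the temporal axis and fusing them with the originals before transformer layers, with the function name and the averaging fusion being assumptions for illustration only — might look like:

```python
import numpy as np

def roll_vision_tokens(tokens: np.ndarray, shift: int = 1) -> np.ndarray:
    """Cyclically shift vision tokens along the temporal axis.

    tokens: array of shape (T, D) -- T temporal vision tokens of dim D.
    After rolling, position t holds the features of position t - shift,
    so a simple fusion exposes cross-frame temporal context to the
    following transformer layers.
    """
    rolled = np.roll(tokens, shift, axis=0)   # wrap last frames to the front
    # Fuse the original and rolled streams (plain average as a placeholder;
    # the paper's actual fusion is not described in the abstract).
    return 0.5 * (tokens + rolled)

# Toy example: 4 frames, 2-dim features
x = np.arange(8, dtype=float).reshape(4, 2)
y = roll_vision_tokens(x, shift=1)
```

Here each output token mixes a frame with its temporal predecessor, which is one cheap way to inject temporal context without annotated boundaries.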

Keyword:

Electric transformer testing; Semantics

Author Community:

  • [ 1 ] [Xi, Zeyu] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 2 ] [Zhou, Xinlang] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 3 ] [Liu, Zilin] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China
  • [ 4 ] [Wu, Lifang] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China


Source:

ISSN: 1865-0929

Year: 2025

Volume: 2302 CCIS

Page: 203-217

Language: English

ESI Highly Cited Papers on the List: 0

