Author:

Kong, Jiangtao | Xu, Rongchao | Xing, Junliang | Li, Kai | Ma, Wei

Indexed by:

CPCI-S

Abstract:

Recently, Convolutional Networks (ConvNets) have become the dominant approach to the human activity classification problem. We investigate current standard ConvNet architectures and pinpoint one of their main limitations: the spatial-temporal dependency is captured only by a global pooling operation, which may not capture the complex long-term spatial-temporal relationships in videos well. In this work, we propose a Spatial Temporal Attentional Glimpse (STAG) module to overcome this shortcoming. Specifically, the input to the STAG module is a 3D tensor, which is first processed by a spatial-temporal attention block. A Spatial Temporal Glimpse block then decomposes the resulting tensor into two low-dimensional tensors and fuses the results of operating on them. The proposed STAG module is pluggable, easy to learn, and computationally efficient. We conduct extensive ablation studies to show that our model, when incorporated with the STAG block, substantially improves performance over the state of the art. All experimental results, the trained models, and the complete source code will be released to facilitate further studies on this problem.
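
The two-stage data flow described in the abstract (attention over a 3D tensor, then decomposition into two lower-dimensional tensors whose results are fused) can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify the attention function, the decomposition, or the fusion, so the softmax weighting, the temporal/spatial split, and the concatenation below are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attend(x):
    """Hypothetical attention block (assumption, not from the paper).

    Reweights every (t, h, w) entry by a softmax taken over the whole
    tensor. x is a nested list of shape T x H x W.
    """
    T, H, W = len(x), len(x[0]), len(x[0][0])
    flat = [x[t][h][w] for t in range(T) for h in range(H) for w in range(W)]
    weights = softmax(flat)
    return [[[x[t][h][w] * weights[(t * H + h) * W + w]
              for w in range(W)] for h in range(H)] for t in range(T)]


def glimpse(a):
    """Hypothetical glimpse block (assumption, not from the paper).

    Splits the attended tensor into a temporal vector (length T) and a
    spatial map (H x W), then fuses the two by concatenation.
    """
    T, H, W = len(a), len(a[0]), len(a[0][0])
    temporal = [sum(a[t][h][w] for h in range(H) for w in range(W))
                for t in range(T)]
    spatial = [[sum(a[t][h][w] for t in range(T)) for w in range(W)]
               for h in range(H)]
    fused = temporal + [v for row in spatial for v in row]
    return temporal, spatial, fused


def stag_sketch(x):
    """Attention followed by glimpse; returns the fused feature vector."""
    return glimpse(attend(x))[2]
```

For a tensor of shape T x H x W, this sketch yields a fused vector of length T + H*W, which is smaller than the T*H*W input; that size reduction is the only property it is meant to convey about working with low-dimensional tensors.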

Keyword:

Classification; Human Action; Deep Learning

Author Community:

  • [ 1 ] [Kong, Jiangtao]Beijing Univ Technol, Beijing, Peoples R China
  • [ 2 ] [Xu, Rongchao]Beijing Univ Technol, Beijing, Peoples R China
  • [ 3 ] [Ma, Wei]Beijing Univ Technol, Beijing, Peoples R China
  • [ 4 ] [Xing, Junliang]Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
  • [ 5 ] [Li, Kai]Chinese Acad Sci, Inst Automat, Beijing, Peoples R China

Reprint Author's Address:

  • [Kong, Jiangtao]Beijing Univ Technol, Beijing, Peoples R China

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP)

ISSN: 1522-4880

Year: 2019

Page: 4040-4044

Language: English

Cited Count:

WoS CC Cited Count: 0

ESI Highly Cited Papers on the List: 0