Abstract:
Attention mechanisms have achieved remarkable success in image captioning under neural encoder-decoder frameworks. However, existing methods introduce attention to the language model, e.g., LSTM (long short-term memory), in a straightforward way: the attention is attached to the LSTM outside its core hidden layer, and the current attention is independent of the previous one. In this paper, by exploring the inner relationship between the attention mechanism and the gates of LSTM, we propose a new attention-gated LSTM model (AGL) that introduces dynamic attention into the language model. In this method, the visual attention is incorporated into the output gate of the LSTM and propagates along with the sequential cell state. The attention in AGL thus acquires dynamic characteristics: the currently focused visual region can provide remote guidance to later states. Quantitative and qualitative experiments conducted on the MS COCO dataset demonstrate the advantage of the proposed method. © 2019 IEEE.
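To make the gating concrete, below is a minimal PyTorch sketch of an attention-gated LSTM cell in the spirit of the abstract: additive attention over region features produces a visual context vector that is injected into the output gate only, so the attended region modulates the exposed hidden state and, through the recurrence, later steps. The module names, the additive attention form, and the extra projection `visual_to_o` are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn as nn


class AttentionGatedLSTMCell(nn.Module):
    """Sketch of an attention-gated LSTM cell (assumed wiring):
    the attended visual context enters the output gate, so the
    focused region shapes h_t and, via h_t, later states."""

    def __init__(self, input_size, hidden_size, visual_size):
        super().__init__()
        # Standard LSTM gate projections over [x_t; h_{t-1}].
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        # Extra projection injecting the visual context into the
        # output gate only (assumption, hypothetical name).
        self.visual_to_o = nn.Linear(visual_size, hidden_size)
        # Simple additive attention over region features (assumption).
        self.att_h = nn.Linear(hidden_size, hidden_size)
        self.att_v = nn.Linear(visual_size, hidden_size)
        self.att_score = nn.Linear(hidden_size, 1)

    def attend(self, h, regions):
        # regions: (batch, num_regions, visual_size)
        scores = self.att_score(torch.tanh(
            self.att_v(regions) + self.att_h(h).unsqueeze(1)))
        alpha = torch.softmax(scores, dim=1)   # (batch, R, 1)
        return (alpha * regions).sum(dim=1)    # (batch, visual_size)

    def forward(self, x, h, c, regions):
        ctx = self.attend(h, regions)
        i, f, g, o = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        i, f, g = torch.sigmoid(i), torch.sigmoid(f), torch.tanh(g)
        # Attention modifies the output gate, not the cell input.
        o = torch.sigmoid(o + self.visual_to_o(ctx))
        c_next = f * c + i * g
        h_next = o * torch.tanh(c_next)
        return h_next, c_next
```

Because the context enters only through the output gate, the cell state update itself stays standard; the attended region influences what the cell exposes at each step, which is one plausible reading of "propagates along with the sequential cell state."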
Year: 2019
Page: 172-177
Language: English
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 1
ESI Highly Cited Papers on the List: 0