Bottom-Up Progressive Semantic Alignment for Image-Text Retrieval

Author：

Cui, Zheng | Hu, Yongli | Sun, Yanfeng | Gao, Junbin | Yin, Baocai

Indexed by：

EI Scopus

Abstract：

Image-text retrieval is a challenging task because images and text are heterogeneous cross-modal data separated by a semantic gap. The key issue in image-text retrieval is how to learn a common feature space that preserves the semantic correspondence between image and text. Some existing works extract region features from images and word features from text to align local elements across modalities; others integrate relation-aware information into the local elements to compute cross-modal similarity. However, these methods do not exploit the semantic information available at different semantic levels. To address this issue, we propose a Bottom-up Progressive Semantic Alignment (BPSA) network, in which precise fine-grained alignment is carried out progressively across diverse semantic levels. Specifically, the features of the cross-modal data are extracted from bottom elements to local groups and then to a global representation, using graph convolution and an attention mechanism. We conduct extensive experiments on the Flickr30K and MS-COCO datasets and compare against related state-of-the-art methods. The results show that our network achieves competitive performance. © 2021, Springer Nature Switzerland AG.

Keywords：

Information retrieval | Modal analysis | Semantics | Alignment

Author Affiliations：

[ 1 ] [Cui, Zheng]Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 2 ] [Hu, Yongli]Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 3 ] [Sun, Yanfeng]Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China
[ 4 ] [Gao, Junbin]Discipline of Business Analytics, The University of Sydney Business School, The University of Sydney, Sydney, NSW, Australia
[ 5 ] [Yin, Baocai]Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing University of Technology, Beijing, China


Source ：

ISSN： 1865-0929

Year： 2021

Volume： 1517 CCIS

Page： 417-424

Language： English

Cited Count：

WoS CC Cited Count： 0

