Domain-aware Prototype Network for Generalized Zero-Shot Learning - Details

Author：

Hu, Y. | Feng, L. | Jiang, H. | Liu, M. | Yin, B.

Indexed by：

EI Scopus SCIE

Abstract：

Generalized zero-shot learning (GZSL) aims to recognize images from both seen and unseen classes with the help of side information, such as manually annotated attribute vectors. Traditional methods focus on mapping images and semantics into a common latent space to achieve visual-semantic alignment. Because the unseen classes are unavailable during training, these methods suffer from a serious recognition bias: they tend to classify unseen-class images as seen classes. To solve this problem, we propose a Domain-aware Prototype Network (DPN), which splits the GZSL problem into a seen-class recognition problem and an unseen-class recognition problem. For the seen classes, we design a domain-aware prototype learning branch with a dual attention feature encoder that captures the essential visual information; this branch aims to recognize the seen classes and to discriminate the novel categories from them. To further recognize the fine-grained unseen classes, we design a visual-semantic embedding branch, which aligns the visual and semantic information for unseen-class recognition. Through multi-task learning of the prototype learning branch and the visual-semantic embedding branch, our model achieves excellent performance on three popular GZSL datasets.
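
The two-branch split described in the abstract can be illustrated with a small routing sketch. This is not the authors' implementation: the abstract does not say how a sample is judged novel, so the distance threshold, the Euclidean and cosine measures, the function names, and the toy data below are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def predict(feature, embedded, seen_prototypes, unseen_attributes, threshold):
    """Route a sample to a seen-class or an unseen-class decision.

    feature: visual feature compared with seen-class prototypes (hypothetical).
    embedded: the same image projected into attribute space (hypothetical).
    threshold: assumed distance cut-off above which a sample counts as novel.
    """
    # Seen branch: nearest prototype in visual feature space.
    label, dist = min(
        ((name, euclidean(feature, proto)) for name, proto in seen_prototypes.items()),
        key=lambda pair: pair[1],
    )
    if dist <= threshold:
        return "seen", label
    # Unseen branch: best attribute match, restricted to unseen classes.
    label = max(
        unseen_attributes,
        key=lambda name: cosine(embedded, unseen_attributes[name]),
    )
    return "unseen", label


# Toy data, invented for this example.
seen_prototypes = {"horse": [1.0, 0.0], "tiger": [0.0, 1.0]}
unseen_attributes = {"zebra": [1.0, 1.0, 0.0], "whale": [0.0, 0.0, 1.0]}

near_seen = predict([0.9, 0.1], [0.0, 0.0, 0.0], seen_prototypes, unseen_attributes, 0.5)
far_from_seen = predict([3.0, 3.0], [0.9, 0.8, 0.1], seen_prototypes, unseen_attributes, 0.5)
```

In this sketch a sample close to a seen prototype never reaches the unseen-class branch, and a sample rejected as novel is compared only with unseen-class attributes, which is one simple way to keep seen classes from absorbing unseen-class images.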

Keyword：

Transformers; Feature extraction; transformer-based dual attention; Task analysis; Visualization; Generalized Zero-Shot Learning; Image recognition; domain detection; Semantics; Prototypes

Author Community：

[1] [Hu Y.] Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing, China
[2] [Feng L.] Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing, China
[3] [Jiang H.] Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing, China
[4] [Liu M.] Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing, China
[5] [Yin B.] Beijing Institute of Artificial Intelligence, Faculty of Information Technology, Beijing Key Laboratory of Multimedia and Intelligent Software Technology, Beijing University of Technology, Beijing, China


Related Keywords：

TransIFC: Invariant Cues-aware Feature Concentration Learning for Efficient Fine-grained Bird Image Classification
2023，IEEE Transactions on Multimedia
DHHG-TAC: Fusion of Dynamic Heterogeneous Hypergraphs and Transformer Attention Mechanism for Visual Question Answering Tasks
2024，IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS
OASNet: Object Affordance State Recognition Network with Joint Visual Features and Relational Semantic Embeddings
2023，IEEE Transactions on Circuits and Systems for Video Technology
Learning Label Semantics for Weakly Supervised Group Activity Recognition
2024，IEEE Transactions on Multimedia

Source ：

IEEE Transactions on Circuits and Systems for Video Technology

ISSN： 1051-8215

Year： 2023

Issue： 5

Volume： 34

Page： 1-1

8.400 (JCR@2022)

ESI Discipline： ENGINEERING;

ESI HC Threshold：19

Cited Count：

WoS CC Cited Count： 0

ESI Highly Cited Papers on the List： 0

30 Days PV： 8
