
Author:

Tan, Hongchen | Yin, Baocai (尹宝才) | Xu, Kaiqiang | Wang, Huasheng | Liu, Xiuping | Li, Xin

Indexed by:

EI; Scopus; SCIE

Abstract:

We propose a novel text-to-image (T2I) generation network, the Attention-bridged Modal Interaction Generative Adversarial Network (AMI-GAN), to better explore modal interaction and perception for high-quality image synthesis. AMI-GAN contains two novel designs: an Attention-bridged Modal Interaction (AMI) module and a Residual Perception Discriminator (RPD). In AMI, we design a multi-scale attention mechanism that exploits semantic alignment, fusion, and enhancement between text and image to better refine the details and contextual semantics of the synthesized image. In RPD, we design a multi-scale information perception mechanism, together with a novel information adjustment function, to encourage the discriminator to better perceive visual differences between real and synthesized images. Consequently, the discriminator drives the generator to improve the visual quality of the synthesized image. Based on these designs, we build two versions: a single-stage generation framework (AMI-GAN-S) and a multi-stage generation framework (AMI-GAN-M). The former can synthesize high-resolution images because of its low computational cost; the latter can synthesize images with realistic detail. Experimental results on two widely used T2I datasets show that our AMI-GANs achieve competitive performance on the T2I task.

Keyword:

Layout; Generative adversarial network; attention-bridged modal interaction; residual perception discriminator; Visualization; Computational modeling; text-to-image synthesis; Generators; Task analysis; Semantics; Image synthesis

Author Affiliations:

  • [ 1 ] [Tan, Hongchen]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 2 ] [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
  • [ 3 ] [Xu, Kaiqiang]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 116024, Peoples R China
  • [ 4 ] [Liu, Xiuping]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 116024, Peoples R China
  • [ 5 ] [Wang, Huasheng]Cardiff Univ, Sch Comp Sci Informat, Cardiff CF10 3AT, Wales
  • [ 6 ] [Li, Xin]Texas A&M Univ, Sch Performance Visualizat & Fine Arts, Sect Visual Comp & Creat Media, College Stn, TX 77843 USA

Reprint Author's Address:

  • [Yin, Baocai]Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY

ISSN: 1051-8215

Year: 2024

Issue: 7

Volume: 34

Page: 5400-5413

Impact Factor: 8.400 (JCR@2022)

Cited Count:

WoS CC Cited Count: 3

SCOPUS Cited Count: 3

ESI Highly Cited Papers on the List: 0

30 Days PV: 6
