ALR-GAN: Adaptive Layout Refinement for Text-to-Image Synthesis

Author：

Tan, Hongchen | Yin, Baocai | Wei, Kun | Liu, Xiuping | Li, Xin

Indexed by：

EI Scopus SCIE

Abstract：

We propose a novel text-to-image generation network, the Adaptive Layout Refinement Generative Adversarial Network (ALR-GAN), which adaptively refines the layout of synthesized images without any auxiliary information. ALR-GAN comprises an Adaptive Layout Refinement (ALR) module and a Layout Visual Refinement (LVR) loss. The ALR module aligns the layout structure (the locations of objects and background) of a synthesized image with that of its corresponding real image. Within the ALR module, we propose an Adaptive Layout Refinement (ALR) loss that balances the matching of hard and easy features for more efficient layout structure matching. Based on the refined layout structure, the LVR loss further refines the visual representation within the layout area. Experimental results on two widely used datasets show that ALR-GAN performs competitively on the text-to-image generation task.

Keyword：

text-to-image synthesis; information consistency constraint; Task analysis; Adaptation models; Training; Visualization; Semantics; object layout refinement; Layout; Generators; Generative adversarial network

Author Affiliations：

[ 1 ] [Tan, Hongchen] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 2 ] [Yin, Baocai] Beijing Univ Technol, Beijing Inst Artificial Intelligence, Beijing 100124, Peoples R China
[ 3 ] [Wei, Kun] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
[ 4 ] [Liu, Xiuping] Dalian Univ Technol, Sch Math Sci, Dalian 116024, Peoples R China
[ 5 ] [Li, Xin] Texas A&M Univ, Sch Performance Visualizat & Fine Arts, Sect Visual Comp & Creat Media, College Stn, TX 77843 USA

Email：

tanhongchenphd@bjut.edu.cn
ybc@bjut.edu.cn
weikunsk@gmail.com
xpliu@dlut.edu.cn
xinli@tamu.edu

Related Articles：

Attention-Bridged Modal Interaction for Text-to-Image Generation
2024，IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
DR-GAN: Distribution Regularization for Text-to-Image Generation
2022，IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
A Two-Stage Multi-Target Domain Adaptation Framework for Prediction of Key Performance Indicators Based on Adversarial Network
2024，IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE
BWGAN-GP: An EEG Data Generation Method for Class Imbalance Problem in RSVP Tasks
2022，IEEE TRANSACTIONS ON NEURAL SYSTEMS AND REHABILITATION ENGINEERING

Source ：

IEEE TRANSACTIONS ON MULTIMEDIA

ISSN： 1520-9210

Year： 2023

Volume： 25

Page： 8620-8631

Impact Factor： 7.300 (JCR@2022)

ESI Highly Cited Papers on the List： 0
