Abstract:
Although generative adversarial networks (GANs) are widely used in text-to-image generation and have made great progress, some problems remain. The convolution operation in these GAN-based methods acts on local regions of the image rather than on disjoint regions, leading to structural anomalies in the generated images. Moreover, the semantic consistency between generated images and their corresponding text descriptions still needs improvement. In this paper, we propose a multi-attention generative adversarial network (MAGAN) for text-to-image generation. We use a self-attention mechanism to improve the overall quality of the images, so that target images with a particular structure can also be generated well, and a multi-head attention mechanism to improve the semantic consistency between generated images and their text descriptions. We conducted extensive experiments on three datasets: the Oxford-102 Flowers dataset, the Caltech-UCSD Birds dataset, and the COCO dataset. Our MAGAN achieves better results than representative methods such as AttnGAN, MirrorGAN and ControlGAN.
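The abstract names two attention mechanisms. As a point of reference for the second one, here is a minimal NumPy sketch of standard multi-head scaled dot-product attention; the function and variable names are illustrative assumptions and do not reflect the paper's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(Q, K, V, num_heads):
    """Standard multi-head scaled dot-product attention (sketch).

    Q, K, V: arrays of shape (seq_len, d_model); d_model must be
    divisible by num_heads. Learned projection matrices are omitted
    for brevity.
    """
    seq_len, d_model = Q.shape
    d_head = d_model // num_heads

    # Split the model dimension into heads: (num_heads, seq_len, d_head).
    def split(x):
        return x.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(Q), split(K), split(V)
    # Per-head attention scores, scaled by sqrt(d_head).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)          # (heads, seq, seq)
    out = weights @ v                           # (heads, seq, d_head)
    # Merge the heads back into (seq_len, d_model).
    return out.transpose(1, 0, 2).reshape(seq_len, d_model)
```

Each head attends over the full sequence with its own slice of the feature dimension, which is what lets multi-head attention relate disjoint regions that a local convolution cannot connect.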
Source:
PATTERN RECOGNITION AND COMPUTER VISION, PT IV
ISSN: 0302-9743
Year: 2021
Volume: 13022
Page: 312-322