Abstract:
Sentiment classification can reveal the opinions expressed by people and help them make better decisions. With the increase of multimodal content on the web, such as text, images, audio, and video, making full use of it is important in many tasks, including sentiment classification. This paper focuses on text and images. Previous work cannot capture the fine-grained features of images, and those models introduce considerable noise during feature fusion. In this work, we propose a novel multimodal sentiment classification model based on a gated attention mechanism. The image feature is used to emphasize text segments through the attention mechanism, which allows the model to focus on the text that affects the sentiment polarity. Moreover, the gating mechanism enables the model to retain useful image information while ignoring the noise introduced during the fusion of image and text. Experimental results on the Yelp multimodal dataset show that our model outperforms the previous SOTA model, and ablation results further demonstrate the effectiveness of the different strategies in the proposed model. (C) 2022 Elsevier B.V. All rights reserved.
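The record contains no code, and the paper's exact architecture is not given here; the following is only a minimal numpy sketch of the general idea the abstract describes (image-guided attention over text tokens followed by a sigmoid gate on the image signal). All function names, dimensions, and the gate parameterization are assumptions, not the authors' method.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_attention_fusion(text_tokens, image_feat):
    """Hypothetical sketch: the image feature scores each text token
    (attention), and a sigmoid gate decides how much image signal to
    keep in the fused representation."""
    # attention scores: similarity between each token and the image feature
    scores = text_tokens @ image_feat          # shape: (n_tokens,)
    weights = softmax(scores)                  # attention distribution
    text_ctx = weights @ text_tokens           # image-attended text summary
    # gate (assumed parameter-free here; the paper would learn this)
    gate = sigmoid(text_ctx * image_feat)      # elementwise in [0, 1]
    fused = text_ctx + gate * image_feat       # gated residual fusion
    return weights, fused

# toy example: three 2-d token embeddings and one 2-d image feature
tokens = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
img = np.array([0.5, -0.5])
w, fused = gated_attention_fusion(tokens, img)
```

The gate is what lets the model suppress image noise: when its output is near 0, the fused vector falls back to the text-only context.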
Source:
KNOWLEDGE-BASED SYSTEMS
ISSN: 0950-7051
Year: 2022
Volume: 240
Impact Factor: 8.800 (JCR@2022)
ESI Discipline: COMPUTER SCIENCE;
ESI HC Threshold:46
JCR Journal Grade:1
CAS Journal Grade:2
Cited Count:
WoS CC Cited Count: 51
SCOPUS Cited Count: 67
ESI Highly Cited Papers on the List: 0