Abstract:
The issue of language priors persists in existing Visual Question Answering (VQA) models, hindering their ability to generalize across diverse QA distributions. Traditional counterfactual sample synthesis strategies, which aim to eliminate language bias by generating counterfactuals for all training samples, face two primary challenges: (1) not every sample contributes to language bias, so indiscriminate counterfactual synthesis may introduce new biases and adversely affect the model's learning process; (2) question counterfactuals often lose significant information and therefore fail to effectively heighten the model's sensitivity to key terms. In this paper, we introduce the Contrastive Visual-Question-Caption Counterfactuals model for Biased Samples in VQA tasks. The model integrates captions to augment visual information within the textual domain and constructs counterfactual samples exclusively for biased samples, thereby mitigating the negative impact of language bias. Specifically, we employ a biased-sample selection module to identify samples with language bias in the training set, since unbiased samples do not exacerbate the model's reliance on language patterns. To enrich the visual content in the textual domain, we synthesize caption-based counterfactual samples. To further enhance the effectiveness of counterfactual samples in improving the model's sensitivity, we develop a counterfactual contrastive learning module that discerns the relationship between the visual and textual components of the same sample. Experimental results demonstrate that the proposed model is not only compatible with various VQA backbones but also significantly improves performance on the out-of-distribution dataset VQA-CP v2. © 2024 Technical Committee on Control Theory, Chinese Association of Automation.
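The abstract describes a counterfactual contrastive learning module that aligns a sample's visual features with its factual question/caption text while repelling the counterfactual text. The abstract gives no formulas, so the sketch below is only an illustrative assumption: an InfoNCE-style objective in PyTorch in which the function name counterfactual_contrastive_loss, the temperature value, and the single-counterfactual pairing scheme are all hypothetical, not the authors' stated formulation.

```python
import torch
import torch.nn.functional as F

def counterfactual_contrastive_loss(
    visual: torch.Tensor,       # (B, D) visual features of each sample
    text: torch.Tensor,         # (B, D) features of the factual question/caption
    text_cf: torch.Tensor,      # (B, D) features of the counterfactual text
    temperature: float = 0.07,  # assumed softmax temperature
) -> torch.Tensor:
    """InfoNCE-style loss: pull each visual feature toward its own factual
    text and push it away from the counterfactual text of the same sample."""
    v = F.normalize(visual, dim=-1)
    t = F.normalize(text, dim=-1)
    t_cf = F.normalize(text_cf, dim=-1)

    pos = (v * t).sum(dim=-1) / temperature      # similarity to factual text
    neg = (v * t_cf).sum(dim=-1) / temperature   # similarity to counterfactual text

    # Class 0 (the factual pair) is the correct "match" for every sample.
    logits = torch.stack([pos, neg], dim=1)      # (B, 2)
    labels = torch.zeros(v.size(0), dtype=torch.long, device=v.device)
    return F.cross_entropy(logits, labels)
```

Under the paper's design, a term like this would presumably be added to the standard VQA answer loss only for samples flagged by the biased-sample selection module, since counterfactuals are constructed exclusively for biased samples.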
ISSN: 1934-1768
Year: 2024
Page: 7603-7609
Language: English
ESI Highly Cited Papers on the List: 0