Abstract:
Scene graph generation (SGG) aims to detect objects in an image and recognize the relations between them, bridging upstream detection tasks and downstream high-level visual understanding tasks. It is widely acknowledged that SGG models tend to over-fit head predicates, which biases the generated scene graphs, and a series of debiasing methods have been proposed to address this problem. However, some existing debiasing SGG methods instead over-fit tail predicates, which is another form of bias. To eliminate one-sided over-fitting to either head or tail predicates, this article proposes a balanced relation prediction (BRP) module that is model-agnostic and compatible with existing re-balancing methods. Moreover, because relation prediction builds on object feature representations, this article proposes a scene adaptive context fusion (SACF) module to refine those representations. Specifically, SACF models context with a chain structure in which the order of objects is adaptively arranged according to the scene content, so that visual information fusion adapts to the scene in which the objects are located. Experiments on the VG and GQA datasets show that the proposed method achieves competitive results on the comprehensive metrics R@K and mR@K.
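The sketch below illustrates the general idea behind SACF as described in the abstract: objects are reordered by a scene-conditioned score and then fused along a chain (here an LSTM). The module name, dimensions, mean-pooled scene summary, and scoring scheme are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of scene-adaptive chain-structured context fusion.
# All design details here are assumptions made for illustration only.
import torch
import torch.nn as nn


class SceneAdaptiveContextFusion(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)  # scene-conditioned ordering score (assumed form)
        self.chain = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (num_objects, dim) features of detected objects in one image
        scene = obj_feats.mean(dim=0, keepdim=True)  # crude global scene summary (assumption)
        pair = torch.cat([obj_feats, scene.expand_as(obj_feats)], dim=-1)
        order = self.score(pair).squeeze(-1).argsort(descending=True)  # adaptive object order
        chained, _ = self.chain(obj_feats[order].unsqueeze(0))  # fuse context along the chain
        refined = torch.empty_like(chained.squeeze(0))
        refined[order] = chained.squeeze(0)  # map fused features back to original object order
        return refined  # (num_objects, dim) refined object representations


# Example: refine features of 5 detected objects before relation prediction.
feats = torch.randn(5, 512)
refined = SceneAdaptiveContextFusion()(feats)
print(refined.shape)  # torch.Size([5, 512])
```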
Source:
ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
ISSN: 1551-6857
Year: 2025
Issue: 3
Volume: 21
Impact Factor: 5.100 (JCR@2022)