Abstract:
Weakly supervised video object segmentation (WSVOS) is a vital yet challenging task in which the aim is to segment pixel-level masks with only category labels. Existing methods still have certain limitations, e.g., difficulty in comprehending appropriate spatiotemporal knowledge and an inability to explore common semantic information with category labels. To overcome these challenges, we formulate a novel framework by integrating multisource saliency and incorporating an exemplar mechanism for WSVOS. Specifically, we propose a multisource saliency module to comprehend spatiotemporal knowledge by integrating spatial and temporal saliency as bottom-up cues, which can effectively eliminate disruptions due to confusing regions and identify attractive regions. Moreover, to our knowledge, we make the first attempt to incorporate an exemplar mechanism into WSVOS by proposing an adaptive exemplar module to process top-down cues, which can provide reliable guidance for co-occurring objects in intraclass videos and identify attentive regions. Our framework, which comprises the two aforementioned modules, offers a new perspective on directly constructing the correspondence between bottom-up cues and top-down cues when ground-truth information for the reference frames is lacking. Comprehensive experiments demonstrate that the proposed framework achieves state-of-the-art performance.
Source: IEEE TRANSACTIONS ON IMAGE PROCESSING
ISSN: 1057-7149
Year: 2021
Volume: 30
Page: 8155-8169
Impact Factor (JCR@2022): 10.600
ESI Discipline: ENGINEERING
ESI HC Threshold: 87
JCR Journal Grade: 1
Cited Count:
WoS CC Cited Count: 7
SCOPUS Cited Count: 8
ESI Highly Cited Papers on the List: 0
30 Days PV: 9