Author:

Li, Jiafeng | Sun, Shengyao | Zhang, Kang | Zhang, Jing (张菁) | Zhuo, Li

Indexed by:

EI; Scopus; SCIE

Abstract:

Detecting unknown objects is a challenging task in computer vision: real-world object categories are highly diverse, whereas existing object-detection training sets cover only a limited number of them. Most existing approaches use two-stage networks to improve a model's ability to characterize objects of unknown classes, which leads to slow inference. To address this issue, we propose CLIP-YOLO, a single-stage unknown object detection method based on the contrastive language-image pre-training (CLIP) model and pseudo-labelling. First, a visual-language embedding alignment method is introduced, and a channel-grouped enhanced coordinate attention module is embedded into the detection head and feature-enhancing component of a YOLO-series detector, to improve the model's ability to characterize and detect objects of unknown categories. Second, pseudo-label generation is optimized using the CLIP model to increase the diversity of the training set and improve coverage of unknown object categories. We validated the method on four challenging datasets: MSCOCO, ILSVRC, Visual Genome, and PASCAL VOC. The results show that our method achieves higher accuracy and faster speed, and thus better unknown object detection performance. The source code is available at https://github.com/BJUTsipl/CLIP-YOLO.
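
The abstract does not say how the alignment or the pseudo-label generation is implemented. As a rough illustration only, the general idea of matching region embeddings to class-name text embeddings can be sketched as below. This is not the authors' code (see the repository linked above for that). The function names, the cosine-similarity rule, and the threshold value are assumptions made for this sketch, and small hand-written vectors stand in for the outputs of the CLIP image and text encoders.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def assign_pseudo_labels(region_emb, text_emb, threshold=0.5):
    """Return (region_index, class_index, score) for confident matches.

    region_emb: (R, D) visual embeddings of candidate boxes (hypothetical input)
    text_emb:   (C, D) text embeddings of class names (hypothetical input)
    threshold:  illustrative cosine-similarity cut-off, not a value from the paper
    """
    sim = l2_normalize(region_emb) @ l2_normalize(text_emb).T  # shape (R, C)
    best = sim.argmax(axis=1)                   # best-matching class per region
    score = sim[np.arange(sim.shape[0]), best]  # similarity of that best match
    keep = score >= threshold                   # drop low-confidence regions
    return [(int(r), int(best[r]), float(score[r])) for r in np.flatnonzero(keep)]

# Toy stand-ins for encoder outputs: two class embeddings, three region embeddings.
text_emb = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0]])
region_emb = np.array([[0.9, 0.1, 0.0],   # close to class 0
                       [0.1, 2.0, 0.0],   # close to class 1
                       [0.0, 0.0, 1.0]])  # orthogonal to both classes
labels = assign_pseudo_labels(region_emb, text_emb, threshold=0.5)
```

In this toy example the third region is orthogonal to both class embeddings, so it falls below the threshold and receives no pseudo-label. A thresholded rule of this kind is one simple way to keep noisy labels out of a training set.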

Keyword:

Single-stage; Pseudo-labeling; Zero-shot detection; CLIP

Author Community:

  • [ 1 ] [Li, Jiafeng]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 2 ] [Sun, Shengyao]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 3 ] [Zhang, Kang]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 4 ] [Zhang, Jing]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 5 ] [Zhuo, Li]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
  • [ 6 ] [Li, Jiafeng]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
  • [ 7 ] [Sun, Shengyao]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
  • [ 8 ] [Zhang, Kang]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
  • [ 9 ] [Zhang, Jing]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China
  • [ 10 ] [Zhuo, Li]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China

Reprint Author's Address:

  • [Li, Jiafeng]Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China; [Li, Jiafeng]Beijing Univ Technol, Beijing Key Lab Computat Intelligence & Intelligen, Beijing 100124, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

ISSN: 1868-8071

Year: 2024

Issue: 2

Volume: 16

Page: 1055-1070

Impact Factor: 5.600 (JCR@2022)

Cited Count:

WoS CC Cited Count: 3

SCOPUS Cited Count: 2

ESI Highly Cited Papers on the List: 0

30 Days PV: 22
