Abstract:
The YOLO (You Only Look Once) family of object detection algorithms has complex model structures and a heavy computational load. In practical application scenarios, however, a deployment must not only meet low-latency and low-power requirements but also contend with the difficult deployment and long development cycles of YOLO algorithms. To solve these problems, a software/hardware co-design method for a universal hardware accelerator that supports rapid deployment of YOLO algorithms is proposed. The method builds on the flexibility of embedded CPU software design and the parallel-computing advantages of the Field Programmable Gate Array (FPGA), combined with the structural characteristics of YOLO algorithms. The hardware acceleration design adopts optimization techniques including multi-channel parallel data exchange between internal and external storage, model parameter reordering, fixed-point quantization, and multi-dimensional parallel tiling. The software driver design is compatible with multiple versions of the YOLO algorithm, which enables rapid deployment. A universal YOLO hardware accelerator system with low latency and low power consumption is implemented on the Zynq-7000 hardware platform. The results show that, when running the YOLOv4-tiny algorithm, the system improves the energy efficiency ratio by 132x compared with a PC CPU and by 120x compared with an embedded CPU. © 2022 SPIE.
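The fixed-point technique named in the abstract can be illustrated with a minimal sketch. The record does not state the paper's word length or scaling scheme, so the 16-bit word with 8 fractional bits used here, and the function names `quantize`, `dequantize`, and `fixed_dot`, are assumptions chosen only for illustration and are not the authors' implementation.

```python
# Illustrative sketch only: signed fixed-point quantization with saturation.
# The 16-bit width and 8 fractional bits are assumptions, not values from the paper.

def quantize(values, frac_bits=8, total_bits=16):
    """Convert floats to signed fixed-point integers, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))        # most negative representable integer
    hi = (1 << (total_bits - 1)) - 1     # most positive representable integer
    return [max(lo, min(hi, round(v * scale))) for v in values]


def dequantize(ints, frac_bits=8):
    """Convert fixed-point integers back to floats."""
    scale = 1 << frac_bits
    return [q / scale for q in ints]


def fixed_dot(a_q, b_q, frac_bits=8):
    """Dot product using integer multiply-accumulate only.

    Each product carries 2 * frac_bits fractional bits, so the accumulator
    is rescaled once at the end rather than after every multiplication.
    """
    acc = sum(x * y for x, y in zip(a_q, b_q))
    return acc / float(1 << (2 * frac_bits))


if __name__ == "__main__":
    weights = quantize([0.5, 0.25])
    activations = quantize([2.0, 4.0])
    print(fixed_dot(weights, activations))
```

Integer multiply-accumulate of this kind maps more directly onto FPGA arithmetic resources than floating-point operations do, which is the usual motivation for fixed-point conversion in accelerators of this type.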
ISSN: 0277-786X
Year: 2022
Volume: 12254
Language: English
Cited Count:
WoS CC Cited Count: 0
ESI Highly Cited Papers on the List: 0