Abstract:
Deep learning is a powerful tool that uses simple representations to express complex ideas, allowing computers to mine hidden information and value from experience, and it has achieved great success in a variety of fields. However, as deep learning models have grown, the computing resources consumed by training them have increased rapidly, making it difficult to deploy the models directly on portable devices or embedded systems. Compressing and accelerating deep neural network (DNN) models without sacrificing performance has therefore become a crucial research area in deep learning. Optimization theory and methods play an important role in the design and implementation of many existing compression techniques. In this paper, we focus on neural network compression from an optimization perspective and review the related optimization strategies. Specifically, we summarize the optimization techniques that arise in four general categories of commonly used network compression approaches: network pruning, low-bit quantization, low-rank factorization, and knowledge distillation. Finally, we provide a summary and discuss some possible research directions.
© 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0). Peer-review under responsibility of the scientific committee of the Tenth International Conference on Information Technology and Quantitative Management.
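As an illustration only (not taken from the paper), two of the four compression approaches the abstract names, magnitude-based network pruning and low-rank factorization, can be sketched in a few lines of NumPy. This is a minimal sketch under assumed names and settings: the helpers magnitude_prune and low_rank_factorize, the 90% sparsity level, and the rank-8 target are all hypothetical choices for demonstration.

import numpy as np

# Hypothetical stand-in for one trained layer's weight matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    # Zero out the fraction `sparsity` of weights with the smallest magnitude.
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

def low_rank_factorize(weights: np.ndarray, rank: int):
    # Approximate W with U @ V, where U is (m x r) and V is (r x n),
    # via truncated SVD, cutting parameters from m*n to r*(m + n).
    U, s, Vt = np.linalg.svd(weights, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank, :]

W_pruned = magnitude_prune(W, sparsity=0.9)
U, V = low_rank_factorize(W, rank=8)
print(f"weights kept after pruning: {np.count_nonzero(W_pruned) / W.size:.1%}")
print(f"rank-8 relative error: {np.linalg.norm(W - U @ V) / np.linalg.norm(W):.3f}")

In practice such compression steps are interleaved with retraining or fine-tuning to recover accuracy, which is where the optimization strategies the paper surveys come into play.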
Source: Procedia Computer Science
ISSN: 1877-0509
Year: 2023
Volume: 221
Page: 1351-1357
Language: English
Cited Count:
WoS CC Cited Count: 0
SCOPUS Cited Count: 2
ESI Highly Cited Papers on the List: 0