AQRG: adaptive quantization reconstruction granularity for post-training quantization

Author：

Zhang, W. | Wang, T. | Fu, G. | Bao, Z.

Indexed by：

Scopus

Abstract：

Deploying models on resource-constrained edge devices remains a critical challenge for the application of neural networks. Quantization is one of the most popular methods for compressing a model to meet these performance limitations. Because it requires only a small amount of calibration data, post-training quantization (PTQ) is better suited to protecting privacy than quantization-aware training (QAT). However, PTQ often causes substantial accuracy degradation below 4 bits, and previous PTQ work has primarily focused on a single reconstruction granularity, either all layer-wise or all block-wise. Our exploratory experiments show that these schemes are sub-optimal. In this paper, we explore the relation between the Hessian matrix trace and the inter-layer dependency, which plays a key role in the choice of quantization reconstruction granularity. Based on this finding, we propose AQRG, a novel hybrid reconstruction granularity quantization scheme that adaptively adjusts quantization granularity guided by the Hessian matrix trace. In image classification and object detection, AQRG achieves better accuracy and robustness to calibration data size on several typical convolutional neural networks. In particular, our 4-bit weight, 2-bit activation (W4A2) scheme on ResNet-18 achieved 65.06% accuracy on the ImageNet dataset. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.

Keyword：

Artificial intelligence | Convolutional neural network | Fine-tuning | Model compression | Post-training quantization

Author Community：

[1] Zhang W., College of Computer Science, Beijing University of Technology, Beijing, China
[2] Wang T., College of Computer Science, Beijing University of Technology, Beijing, China
[3] Fu G., College of Computer Science, Beijing University of Technology, Beijing, China
[4] Bao Z., College of Computer Science, Beijing University of Technology, Beijing, China

Source ：

Neural Computing and Applications

ISSN： 0941-0643

Year： 2025

6.000 (JCR@2022)
