GSE-MN4: Group-Shared Exponents Integer Quantization for MobileNetV4

Authors: Shenglin Yang, Zhuo Han, Hahn Yuan, and Yanmei Hu
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 340-352
Keywords: Post-Training Quantization, MobileNetV4

Abstract

This paper introduces Group-Shared Exponents Integer Quantization for MobileNetV4 (GSE-MN4), a novel quantization framework tailored for efficient deployment of deep learning models on resource-constrained edge devices. Our method employs the Group-Shared Exponents (GSE) format, which shares a single exponent among each group of parameters and quantizes the mantissas under that shared-exponent constraint, significantly reducing memory overhead compared to traditional quantization techniques. Furthermore, we introduce an automated mixed-precision quantization scheme that allocates bit-widths according to layer sensitivity, assigning each layer an optimal quantization bit-width. This strategy effectively optimizes the trade-off between accuracy and efficiency. Extensive experiments on the ImageNet-1K dataset demonstrate that GSE-MN4 outperforms conventional quantization methods. For instance, the GSE-MIX quantization method on MNv4-Conv-S achieves a Top-1 accuracy of 73.34% with a memory footprint of only 3.23 MB, maintaining high accuracy while substantially reducing memory usage. Our work highlights the potential of GSE-INT for efficient and accurate deployment of deep learning models in mobile and edge scenarios.
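To make the core idea concrete, the following is a minimal NumPy sketch of group-shared-exponent quantization as the abstract describes it: each group of weights shares one exponent (here derived from the group's maximum magnitude), and individual values keep only a low-bit signed integer mantissa scaled by that exponent. The function names, group size, and the power-of-two exponent rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def gse_quantize(weights, group_size=16, mantissa_bits=4):
    """Hypothetical GSE sketch: one shared exponent per group of weights,
    low-bit signed integer mantissas per value (not the paper's exact method)."""
    w = np.asarray(weights, dtype=np.float32)
    pad = (-len(w)) % group_size                 # pad so length divides evenly
    groups = np.pad(w, (0, pad)).reshape(-1, group_size)

    # Shared exponent per group: smallest power of two covering the max magnitude.
    max_mag = np.abs(groups).max(axis=1, keepdims=True)
    exponents = np.ceil(np.log2(np.maximum(max_mag, 1e-38))).astype(np.int32)

    # Integer mantissas in [-(2^(b-1)-1), 2^(b-1)-1] under the shared exponent.
    qmax = 2 ** (mantissa_bits - 1) - 1
    scale = (2.0 ** exponents) / qmax
    mantissas = np.clip(np.round(groups / scale), -qmax, qmax).astype(np.int8)
    return mantissas, exponents, scale

def gse_dequantize(mantissas, scale, orig_len):
    """Reconstruct approximate float weights from mantissas and group scales."""
    return (mantissas.astype(np.float32) * scale).reshape(-1)[:orig_len]

# Usage: quantize random weights and measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=1000).astype(np.float32)
m, e, s = gse_quantize(w, group_size=16, mantissa_bits=4)
w_hat = gse_dequantize(m, s, len(w))
err = float(np.abs(w - w_hat).max())
```

Storing one exponent per group rather than one full-precision scale per tensor (or per channel) is what drives the memory savings the abstract reports; the mixed-precision scheme would then vary `mantissa_bits` per layer according to sensitivity.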