GSE-MN4: Group-Shared Exponents Integer Quantization for MobileNetV4

Authors: Shenglin Yang, Zhuo Han, Hahn Yuan, and Yanmei Hu
Conference: ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages: 340-352
Keywords: Post-Training Quantization, MobileNetV4

Abstract

This paper introduces Group-Shared Exponents Integer Quantization for MobileNetV4 (GSE-MN4), a novel quantization framework tailored for efficient deployment of deep learning models on resource-constrained edge devices. Our method employs the Group-Shared Exponents (GSE) format, which shares a single exponent among each group of parameters and quantizes the mantissas under that shared-exponent constraint, significantly reducing memory overhead compared to traditional quantization techniques. Furthermore, we introduce an automated mixed-precision quantization scheme that allocates bit-widths according to layer sensitivity, assigning each layer an optimal quantization bit-width. This strategy effectively optimizes the trade-off between accuracy and efficiency. Extensive experiments on the ImageNet-1K dataset demonstrate that GSE-MN4 outperforms conventional quantization methods. For instance, the GSE-MIX quantization method on MNv4-Conv-S achieves a Top-1 accuracy of 73.34% with a memory footprint of only 3.23 MB, maintaining high accuracy while substantially reducing memory usage. Our work highlights the potential of GSE-INT for efficient and accurate deployment of deep learning models in mobile and edge scenarios.
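To make the core idea concrete, the following is a minimal NumPy sketch of group-shared-exponent quantization as the abstract describes it: each group of weights shares one exponent (here derived from the group's maximum magnitude), and individual values keep only a low-bit signed integer mantissa scaled by that exponent. The function names, group size, and the power-of-two exponent rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def gse_quantize(weights, group_size=16, mantissa_bits=4):
    """Hypothetical GSE sketch: one shared exponent per group of weights,
    low-bit signed integer mantissas per value (not the paper's exact method)."""
    w = np.asarray(weights, dtype=np.float32)
    pad = (-len(w)) % group_size                 # pad so length divides evenly
    groups = np.pad(w, (0, pad)).reshape(-1, group_size)

    # Shared exponent per group: smallest power of two covering the max magnitude.
    max_mag = np.abs(groups).max(axis=1, keepdims=True)
    exponents = np.ceil(np.log2(np.maximum(max_mag, 1e-38))).astype(np.int32)

    # Integer mantissas in [-(2^(b-1)-1), 2^(b-1)-1] under the shared exponent.
    qmax = 2 ** (mantissa_bits - 1) - 1
    scale = (2.0 ** exponents) / qmax
    mantissas = np.clip(np.round(groups / scale), -qmax, qmax).astype(np.int8)
    return mantissas, exponents, scale

def gse_dequantize(mantissas, scale, orig_len):
    """Reconstruct approximate float weights from mantissas and group scales."""
    return (mantissas.astype(np.float32) * scale).reshape(-1)[:orig_len]

# Usage: quantize random weights and measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=1000).astype(np.float32)
m, e, s = gse_quantize(w, group_size=16, mantissa_bits=4)
w_hat = gse_dequantize(m, s, len(w))
err = float(np.abs(w - w_hat).max())
```

Storing one exponent per group rather than one full-precision scale per tensor (or per channel) is what drives the memory savings the abstract reports; the mixed-precision scheme would then vary `mantissa_bits` per layer according to sensitivity.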