CMAE: Channel-Masked Autoencoders
Authors:
Yutao Wang, Yu Nie, and Yilai Zhang
Conference:
ICIC 2025 Posters, Ningbo, China, July 26-29, 2025
Pages:
516-532
Keywords:
Computer vision, Self-supervised learning, Visual feature learning, Transfer learning
Abstract
Self-supervised learning has produced strong frameworks such as MAE and MoCo. However, their complexity and reliance on specific architectures limit their universality and scalability across different models. Although these training paradigms have improved the performance of lightweight models to some extent, related research remains scarce. This paper therefore explores new approaches for enhancing lightweight models and proposes a more universal, concise, and scalable self-supervised learning framework called Channel-Masked Autoencoders (CMAE). CMAE addresses the incompatibility between the MAE framework and convolutional neural networks and applies well to lightweight models. We also investigate the impact of noise strategies on the performance of lightweight models and apply them to CMAE. Our method is concise and efficient: the encoder learns latent representations from grayscale images obtained by randomly masking two color channels and applying approximately 50% random cropping, and these representations give the decoder the information it needs to reconstruct the original image. The idea stems from the observation that human vision relies primarily on texture and shape features rather than color. We conducted experiments on multiple datasets and tasks to comprehensively evaluate the universality and generalization capabilities of the model. In these experiments, CMAE performed strongly; notably, the MobileViTv3 model pre-trained with CMAE achieved a 3.7 percentage point improvement in classification accuracy on the Mini-ImageNet dataset. CMAE also demonstrated advantages over MoCov3.
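The input corruption described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `channel_mask_and_crop`, the parameter `crop_area_ratio`, and the reading of "approximately 50 random cropping" as a random window covering roughly half of the image area are all assumptions made for illustration. The encoder and decoder themselves are not shown.

```python
import numpy as np


def channel_mask_and_crop(image, crop_area_ratio=0.5, rng=None):
    """Hypothetical sketch of a CMAE-style encoder input.

    image: array of shape (height, width, 3) holding an RGB image.
    Returns the single-channel crop and the index of the kept channel.
    """
    rng = np.random.default_rng() if rng is None else rng
    height, width, channels = image.shape

    # Mask two of the three color channels by keeping one at random,
    # which leaves a single-channel (grayscale-like) image.
    kept_channel = int(rng.integers(channels))
    gray = image[:, :, kept_channel]

    # Assumption: "about 50% random cropping" means a randomly placed
    # window whose area is roughly crop_area_ratio of the full image.
    side_scale = np.sqrt(crop_area_ratio)
    crop_h = max(1, int(round(height * side_scale)))
    crop_w = max(1, int(round(width * side_scale)))
    top = int(rng.integers(0, height - crop_h + 1))
    left = int(rng.integers(0, width - crop_w + 1))

    return gray[top:top + crop_h, left:left + crop_w], kept_channel
```

In a setup like this, the reconstruction target would be the original three-channel image, so the model has to infer both the missing colors and the cropped-out context from texture and shape alone.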
BibTeX Citation:
@inproceedings{ICIC2025,
author = {Yutao Wang and Yu Nie and Yilai Zhang},
title = {CMAE: Channel-Masked Autoencoders},
booktitle = {Proceedings of the 21st International Conference on Intelligent Computing (ICIC 2025)},
month = {July},
date = {26-29},
year = {2025},
address = {Ningbo, China},
pages = {516--532},
doi = {10.65286/icic.v21i4.77236}
}