arXiv:2511.11716

CompressNAS: A Fast and Efficient Technique for Model Compression using Decomposition

Published on Nov 12, 2025
Abstract

CompressNAS enables global rank selection for low-rank tensor decomposition in CNNs, achieving significant model compression with minimal accuracy loss on the ImageNet and COCO datasets.

AI-generated summary

Deep Convolutional Neural Networks (CNNs) are increasingly difficult to deploy on microcontrollers (MCUs) and lightweight NPUs (Neural Processing Units) due to their growing size and compute demands. Low-rank tensor decomposition, such as Tucker factorization, is a promising way to reduce parameters and operations with reasonable accuracy loss. However, existing approaches select ranks locally and often ignore the global trade-off between compression and accuracy. We introduce CompressNAS, a MicroNAS-inspired framework that treats rank selection as a global search problem. CompressNAS employs a fast accuracy estimator to evaluate candidate decompositions, enabling efficient yet exhaustive rank exploration under memory and accuracy constraints. On ImageNet, CompressNAS compresses ResNet-18 by 8x with less than a 4% accuracy drop; on COCO, we achieve 2x compression of YOLOv5s without any accuracy drop and 2x compression of YOLOv5n with a 2.5% drop. Finally, we present a new family of compressed models, STResNet, with competitive performance compared to other efficient models.
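To make the Tucker factorization concrete, here is a minimal PyTorch sketch of a Tucker-2 decomposition of a single Conv2d layer, the kind of candidate decomposition the search would score with its accuracy estimator. This is an illustration, not the paper's implementation: the function name `tucker2_decompose_conv` and the rank pair `(rank_in, rank_out)` are assumptions, and the ranks are fixed by hand here rather than found by CompressNAS's global search.

```python
# Tucker-2 compression of a KxK Conv2d into a 1x1 -> KxK -> 1x1 chain,
# using truncated HOSVD (leading singular vectors of two mode unfoldings).
# Illustrative sketch only; names and ranks are assumptions.
import torch
import torch.nn as nn


def tucker2_decompose_conv(conv: nn.Conv2d, rank_in: int, rank_out: int) -> nn.Sequential:
    W = conv.weight.data  # shape: (C_out, C_in, K, K)
    c_out, c_in, kh, kw = W.shape

    # Mode-0 unfolding (output channels) and mode-1 unfolding (input channels).
    unfold0 = W.reshape(c_out, -1)                     # (C_out, C_in*K*K)
    unfold1 = W.permute(1, 0, 2, 3).reshape(c_in, -1)  # (C_in, C_out*K*K)

    # Leading left singular vectors give the two factor matrices.
    U0, _, _ = torch.linalg.svd(unfold0, full_matrices=False)
    U1, _, _ = torch.linalg.svd(unfold1, full_matrices=False)
    U0, U1 = U0[:, :rank_out], U1[:, :rank_in]         # (C_out, r_out), (C_in, r_in)

    # Core tensor: project W onto the two factor subspaces.
    core = torch.einsum('oikl,or,is->rskl', W, U0, U1)  # (r_out, r_in, K, K)

    # 1x1 conv projecting inputs down, KxK core conv, 1x1 conv projecting back up.
    down = nn.Conv2d(c_in, rank_in, 1, bias=False)
    mid = nn.Conv2d(rank_in, rank_out, (kh, kw),
                    stride=conv.stride, padding=conv.padding, bias=False)
    up = nn.Conv2d(rank_out, c_out, 1, bias=conv.bias is not None)

    down.weight.data = U1.t().contiguous().unsqueeze(-1).unsqueeze(-1)  # (r_in, C_in, 1, 1)
    mid.weight.data = core
    up.weight.data = U0.unsqueeze(-1).unsqueeze(-1)                     # (C_out, r_out, 1, 1)
    if conv.bias is not None:
        up.bias.data = conv.bias.data
    return nn.Sequential(down, mid, up)


# Example: compress one 3x3 layer, as in a ResNet-18 block, with a candidate rank pair.
conv = nn.Conv2d(128, 128, 3, padding=1)
compressed = tucker2_decompose_conv(conv, rank_in=32, rank_out=32)
x = torch.randn(1, 128, 28, 28)
print(conv(x).shape, compressed(x).shape)  # both (1, 128, 28, 28)
```

Each `(rank_in, rank_out)` pair is one point in the per-layer search space: the parameter count drops from `c_out*c_in*K*K` to `c_in*rank_in + rank_in*rank_out*K*K + rank_out*c_out`, which is the quantity a memory constraint would bound, while the accuracy estimator scores how much the approximation hurts the network.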
