[2603.24916] Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
Computer Science > Machine Learning
arXiv:2603.24916 (cs) [Submitted on 26 Mar 2026]

Title: Once-for-All Channel Mixers (HYPERTINYPW): Generative Compression for TinyML
Authors: Yassien Shaalan

Abstract: Deploying neural networks on microcontrollers is constrained by kilobytes of flash and SRAM, where 1x1 pointwise (PW) mixers often dominate memory even after INT8 quantization across vision, audio, and wearable sensing. We present HYPER-TINYPW, a compression-as-generation approach that replaces most stored PW weights with generated weights: a shared micro-MLP synthesizes PW kernels once at load time from tiny per-layer codes, caches them, and executes them with standard integer operators. This preserves commodity MCU runtimes and adds only a one-off synthesis cost; steady-state latency and energy match INT8 separable-CNN baselines. Enforcing a shared latent basis across layers removes cross-layer redundancy, while keeping PW1 in INT8 stabilizes early, morphology-sensitive mixing. We contribute (i) TinyML-faithful packed-byte accounting covering the generator, heads/factorization, codes, kept PW1, and backbone; (ii) a unified evaluation with validation-tuned t* and bootstrap confidence intervals; and (iii) a deployability analysis covering integer-only inference and boot versus lazy synthesis. On three ECG benchmarks (Apnea-EC...
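The load-time synthesis idea in the abstract can be illustrated with a minimal sketch. All dimensions, layer names, and the generator architecture below are hypothetical (the paper's actual generator, factorization, and quantization scheme are not specified in this abstract): a single shared micro-MLP maps a tiny per-layer code to a full 1x1 pointwise kernel, the result is quantized to INT8 and cached once, and steady-state inference then uses only integer matrix multiplies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): per-layer code length, generator
# hidden width, and a pointwise mixer taking C_IN=16 channels to C_OUT=32.
CODE_DIM, HIDDEN, C_IN, C_OUT = 8, 32, 16, 32

# Shared micro-MLP generator parameters, reused by every generated PW layer.
W1 = rng.normal(0.0, 0.1, (CODE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, C_OUT * C_IN))

def synthesize_pw_int8(code):
    """Generate one 1x1 PW kernel from a per-layer code, then quantize to INT8."""
    h = np.tanh(code @ W1)                      # micro-MLP hidden layer
    w = (h @ W2).reshape(C_OUT, C_IN)           # float kernel, C_OUT x C_IN
    scale = max(np.abs(w).max() / 127.0, 1e-8)  # per-tensor symmetric scale
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

# One-off synthesis at load time; PW1 would be kept as stored INT8 weights,
# so only the later mixers (hypothetically pw2..pw5 here) are generated.
codes = {f"pw{i}": rng.normal(size=CODE_DIM) for i in range(2, 6)}
cache = {name: synthesize_pw_int8(c) for name, c in codes.items()}

def pw_conv_int8(x_q, name):
    """Integer-only 1x1 conv: (N, C_IN) int8 activations -> int32 accumulators."""
    w_q, _ = cache[name]
    return x_q.astype(np.int32) @ w_q.T.astype(np.int32)

# Stored-byte comparison in the spirit of the paper's packed-byte accounting:
# per-layer codes vs. full INT8 PW weights (generator bytes amortize across layers).
bytes_codes = len(codes) * CODE_DIM * 4          # float32 codes
bytes_full_pw = len(codes) * C_OUT * C_IN * 1    # INT8 kernels
```

In this toy accounting, four layers of codes occupy 128 bytes versus 2048 bytes for the four stored INT8 kernels; the shared generator is a fixed overhead paid once, which is why a shared latent basis across layers pays off as depth grows.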