[2602.13670] Advancing Analytic Class-Incremental Learning through Vision-Language Calibration
Summary
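This paper presents VILA, a dual-branch framework for analytic class-incremental learning with pre-trained models. It applies vision-language calibration at both the feature level and the decision level, aiming to keep the efficiency of closed-form analytic updates while improving long-term stability.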
Why It Matters
Class-incremental learning with pre-trained models must trade efficient adaptation against long-term stability. Analytic learning offers rapid, recursive closed-form updates, but accumulated errors and feature incompatibility often undermine it. VILA targets that brittleness, which matters for models that must keep learning new classes in changing, real-world environments.
Key Takeaways
- VILA framework enhances analytic class-incremental learning.
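- Uses a dual-branch design with two-level vision-language calibration: geometric calibration at the feature level and cross-modal priors at the decision level.
- Identifies representation rigidity as the primary bottleneck of analytic class-incremental learning with pre-trained models.
- Reports superior performance across eight benchmarks.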
- Maintains efficiency while improving prediction accuracy.
Computer Science > Machine Learning
arXiv:2602.13670 (cs) [Submitted on 14 Feb 2026]
Title: Advancing Analytic Class-Incremental Learning through Vision-Language Calibration
Authors: Binyu Zhao, Wei Zhang, Xingrui Yu, Zhaonian Zou, Ivor Tsang
Abstract: Class-incremental learning (CIL) with pre-trained models (PTMs) faces a critical trade-off between efficient adaptation and long-term stability. While analytic learning enables rapid, recursive closed-form updates, its efficacy is often compromised by accumulated errors and feature incompatibility. In this paper, we first conduct a systematic study to dissect the failure modes of PTM-based analytic CIL, identifying representation rigidity as the primary bottleneck. Motivated by these insights, we propose VILA, a novel dual-branch framework that advances analytic CIL via a two-level vision-language calibration strategy. Specifically, we coherently fuse plastic, task-adapted features with a frozen, universal semantic anchor at the feature level through geometric calibration, and leverage cross-modal priors at the decision level to rectify prediction bias. This confluence maintains analytic-learning's extreme efficiency while overcoming its inherent brittleness. Extensive experiments across eight benchmarks demonstrate that VILA consistently yields superior performanc...
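The abstract describes analytic learning as enabling "rapid, recursive closed-form updates". As a rough illustration of what such an update can look like, the following is a minimal NumPy sketch of recursive ridge regression on fixed feature vectors, a standard way to realize this idea. It is not VILA and not the authors' code: the class name `RecursiveRidgeClassifier`, the `reg` parameter, and the one-hot targets are assumptions made for this example, and none of the paper's vision-language calibration is included.

```python
import numpy as np


class RecursiveRidgeClassifier:
    """Hypothetical linear head updated in closed form, one task at a time."""

    def __init__(self, feature_dim, reg=1.0):
        # R tracks (X^T X + reg * I)^{-1} over all data seen so far.
        self.R = np.eye(feature_dim) / reg
        # W maps features to class scores; it starts with zero classes.
        self.W = np.zeros((feature_dim, 0))

    def fit_task(self, X, y, num_new_classes):
        """Absorb one task. X: (n, d) features, y: (n,) global class ids."""
        d = self.W.shape[0]
        # Add zero-initialized weight columns for the newly arrived classes.
        self.W = np.hstack([self.W, np.zeros((d, num_new_classes))])
        Y = np.eye(self.W.shape[1])[y]  # one-hot targets over all classes

        # Woodbury identity: update the inverse without revisiting old data.
        n = X.shape[0]
        K = np.linalg.solve(np.eye(n) + X @ self.R @ X.T, X @ self.R)
        self.R = self.R - self.R @ X.T @ K

        # Correct the weights using the residual on the new task only.
        self.W = self.W + self.R @ X.T @ (Y - X @ self.W)

    def predict(self, X):
        return np.argmax(X @ self.W, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clf = RecursiveRidgeClassifier(feature_dim=8, reg=1.0)
    # Task 1 introduces classes 0 and 1; task 2 introduces classes 2 and 3.
    clf.fit_task(rng.normal(size=(30, 8)), rng.integers(0, 2, size=30), 2)
    clf.fit_task(rng.normal(size=(30, 8)), rng.integers(2, 4, size=30), 2)
    print(clf.W.shape)
```

In this sketch, `W` after each task equals the ridge solution that would be obtained by fitting all data seen so far in one batch, so only `R` and `W` need to be kept between tasks and no earlier samples are stored.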