[2603.23436] Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning
Computer Science > Machine Learning
arXiv:2603.23436 (cs)
[Submitted on 24 Mar 2026]

Title: Similarity-Aware Mixture-of-Experts for Data-Efficient Continual Learning
Authors: Connor Mclaughlin, Nigel Lee, Lili Su

Abstract: Machine learning models often need to adapt to new data after deployment due to structured or unstructured real-world dynamics. The Continual Learning (CL) framework enables continuous model adaptation, but most existing approaches assume either that each task contains sufficiently many data samples or that the learning tasks are non-overlapping. In this paper, we address the more general setting where each task may have a limited dataset, and tasks may overlap in an arbitrary manner without a priori knowledge. This general setting is substantially more challenging for two reasons. On the one hand, data scarcity necessitates effective contextualization of general knowledge and efficient knowledge transfer across tasks. On the other hand, unstructured task overlapping can easily result in negative knowledge transfer. To address the above challenges, we propose an adaptive mixture-of-experts (MoE) framework over pre-trained models that progressively establishes similarity awareness among tasks. Our design contains two innovative algorithmic components: incremental global pooling and instance...
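As a point of reference for the abstract's similarity-aware MoE idea, the sketch below shows a generic mixture-of-experts router that weights experts by the similarity between a task embedding and per-expert prototypes. This is an illustration of standard similarity-based gating only, not the paper's algorithm; all function names (`softmax`, `cosine`, `moe_output`) and the toy experts/prototypes are hypothetical.

```python
# Generic similarity-based MoE gating (illustration only; NOT the paper's
# incremental global pooling or instance-level routing). Each expert has a
# prototype vector; the router softmax-weights experts by cosine similarity
# between the current task embedding and each prototype.
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def moe_output(x, task_emb, experts, prototypes):
    # Blend expert outputs, weighted by task-prototype similarity.
    weights = softmax([cosine(task_emb, p) for p in prototypes])
    return sum(w * f(x) for w, f in zip(weights, experts))

# Two toy experts; the router favors the expert whose prototype is
# most similar to the task embedding.
experts = [lambda x: 2.0 * x, lambda x: -1.0 * x]
prototypes = [[1.0, 0.0], [0.0, 1.0]]
y = moe_output(3.0, [0.9, 0.1], experts, prototypes)
```

In this toy run the task embedding `[0.9, 0.1]` is closest to the first prototype, so the first expert dominates the blended output. A similarity-aware CL system would additionally update the prototypes (and possibly spawn experts) as new tasks arrive, which is where designs like the paper's incremental pooling come in.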