[2603.03535] Trade-offs in Ensembling, Merging and Routing Among

[2603.03535] Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts

arXiv - Machine Learning March 05, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.03535: Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts

Computer Science > Machine Learning arXiv:2603.03535 (cs) [Submitted on 3 Mar 2026] Title:Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts Authors:Sanae Lotfi, Lucas Caccia, Alessandro Sordoni, Jordan T. Ash, Miroslav Dudik View a PDF of the paper titled Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts, by Sanae Lotfi and 4 other authors View PDF HTML (experimental) Abstract:While large language models (LLMs) fine-tuned with lightweight adapters achieve strong performance across diverse tasks, their performance on individual tasks depends on the fine-tuning strategy. Fusing independently trained models with different strengths has shown promise for multi-task learning through three main strategies: ensembling, which combines outputs from independent models; merging, which fuses model weights via parameter averaging; and routing, which integrates models in an input-dependent fashion. However, many design decisions in these approaches remain understudied, and the relative benefits of more sophisticated ensembling, merging and routing techniques are not fully understood. We empirically evaluate their trade-offs, addressing two key questions: What are the advantages of going beyond uniform ensembling or merging? And does the flexibility of routing justify its complexity? Our findings indicate that non-uniform ensembling and merging improve performance, but routing offers even greater gains. To mitigate the computati...

Originally published on March 05, 2026. Curated by AI News.

Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

Paper Finds That Leading AI Chatbots Like ChatGPT and Claude Remain Incredibly Sycophantic, Resulting in Twisted Effects on Users

A study found that sycophancy is pervasive among chatbots, and that bots are more likely than human peers to affirm a person's bad behavior.

AI Tools & Products · 6 min · about 1 hour ago

Llms

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last w...

TechCrunch - AI · 3 min · about 3 hours ago

Llms

Von Hammerstein’s Ghost: What a Prussian General’s Officer Typology Can Teach Us About AI Misalignment

Greetings all - I've posted mostly in r/claudecode and r/aigamedev a couple of times previously. Working with CC for personal projects re...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

[2603.03535] Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts

About this article

Related Articles

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Paper Finds That Leading AI Chatbots Like ChatGPT and Claude Remain Incredibly Sycophantic, Resulting in Twisted Effects on Users

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

Von Hammerstein’s Ghost: What a Prussian General’s Officer Typology Can Teach Us About AI Misalignment

No comments

Stay updated with AI News