[2508.13415] MAVIS: Multi-Objective Alignment via Inference-Time Value-Guided Selection

Summary

The paper introduces MAVIS, a framework for aligning large language models (LLMs) to multiple objectives at inference time, enhancing flexibility and efficiency in model behavior control.

Why It Matters

As LLMs are increasingly used in applications requiring nuanced outputs, MAVIS provides a solution to the limitations of traditional fine-tuning methods. This approach allows for dynamic adjustments to model behavior, making it more adaptable to user preferences without the need for extensive retraining.

Key Takeaways

  • MAVIS enables multi-objective alignment without modifying model weights.
  • The framework uses value models to adjust output distributions based on user-defined preferences.
  • Empirical results show MAVIS achieves better trade-offs across objectives than traditional fine-tuning baselines.

Computer Science > Machine Learning
arXiv:2508.13415 (cs) [Submitted on 19 Aug 2025 (v1), last revised 14 Feb 2026 (this version, v3)]

Title: MAVIS: Multi-Objective Alignment via Inference-Time Value-Guided Selection
Authors: Jeremy Carleton, Debajoy Mukherjee, Srinivas Shakkottai, Dileep Kalathil

Abstract: Large Language Models (LLMs) are increasingly deployed across diverse applications that demand balancing multiple, often conflicting, objectives -- such as helpfulness, harmlessness, or humor. Many traditional methods for aligning outputs to user-specific preferences require fine-tuning models for each objective or for specific preference configurations, which is computationally expensive and inflexible. We introduce MAVIS -- Multi-Objective Alignment via Inference-Time Value-Guided Selection -- a lightweight inference-time alignment framework that enables dynamic control over LLM behavior without modifying the base model's weights. MAVIS trains a set of small value models, each corresponding to a distinct objective. At inference time, these value models are combined using user-specified weights to produce a tilting function that adjusts the base model's output distribution toward desired trade-offs. The value models are trained using a simple iterative algorithm that enables monoton...
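The abstract describes combining per-objective value models with user-specified weights into a tilting function over the base model's output distribution. The paper's exact formulation is not given in this summary, so the sketch below assumes a common exponential-tilting form, p(token) ∝ p_base(token) · exp(β · Σᵢ wᵢ · Vᵢ(token)); the function and variable names (`tilted_distribution`, `beta`, the objective names) are illustrative, not taken from the paper.

```python
import math

def tilted_distribution(base_logprobs, value_scores, weights, beta=1.0):
    """Tilt a base model's next-token distribution by weighted value scores.

    base_logprobs: dict token -> log p_base(token)
    value_scores:  dict objective -> dict token -> value estimate V_i(token)
    weights:       dict objective -> user-specified weight w_i
    beta:          overall tilting strength (assumed hyperparameter)
    """
    # Add the weighted sum of value scores to each token's base log-probability.
    tilted = {}
    for tok, lp in base_logprobs.items():
        bonus = sum(w * value_scores[obj][tok] for obj, w in weights.items())
        tilted[tok] = lp + beta * bonus
    # Renormalize with a numerically stable log-sum-exp.
    m = max(tilted.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in tilted.values()))
    return {tok: math.exp(v - log_z) for tok, v in tilted.items()}

# With all weights zero the base distribution is recovered; raising the weight
# on an objective shifts mass toward tokens that objective's value model favors.
base = {"a": math.log(0.5), "b": math.log(0.5)}
values = {"helpfulness": {"a": 1.0, "b": 0.0}}
uniform = tilted_distribution(base, values, {"helpfulness": 0.0})
tilted = tilted_distribution(base, values, {"helpfulness": 2.0})
```

Because the base model's weights never change, the same value models can serve any preference configuration: adjusting the `weights` dict at inference time moves the output along the trade-off frontier without retraining.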
