[2604.00200] Offline Constrained RLHF with Multiple Preference Oracles
Computer Science > Machine Learning
arXiv:2604.00200 (cs) [Submitted on 31 Mar 2026]

Title: Offline Constrained RLHF with Multiple Preference Oracles
Authors: Brenden Latham, Mehrdad Moharrami

Abstract: We study offline constrained reinforcement learning from human feedback (RLHF) with multiple preference oracles. Motivated by applications that trade off performance against safety or fairness, we aim to maximize target-population utility subject to a minimum protected-group welfare constraint. From pairwise comparisons collected under a reference policy, we estimate oracle-specific rewards via maximum likelihood and analyze how statistical uncertainty propagates through the dual program. We cast the constrained objective as a KL-regularized Lagrangian whose primal optimizer is a Gibbs policy, reducing learning to a convex dual problem. We propose a dual-only algorithm that ensures high-probability constraint satisfaction, and we provide the first finite-sample performance guarantees for offline constrained preference learning. Finally, we extend our theoretical analysis to multiple constraints and general f-divergence regularization.

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2604.00200 [cs.LG] (or arXiv:2604.00200v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.00200
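The abstract page itself carries no code, but the pipeline it describes (maximum-likelihood reward estimation from pairwise comparisons, a Gibbs-policy primal, and a search over the dual variable) can be illustrated concretely. Below is a minimal sketch in a discrete single-state (bandit-style) setting, without the paper's uncertainty quantification or finite-sample machinery. All names here (fit_bt_rewards, gibbs_policy, solve_dual, tau, beta, lam) are hypothetical illustrations, and a Bradley-Terry comparison model is assumed; the authors' actual algorithm and guarantees are in the PDF.

```python
import numpy as np

def fit_bt_rewards(n_actions, comparisons, lr=0.1, iters=500):
    """MLE of per-action rewards under an assumed Bradley-Terry model.
    comparisons: list of (winner, loser) index pairs from one oracle."""
    r = np.zeros(n_actions)
    for _ in range(iters):
        grad = np.zeros(n_actions)
        for w, l in comparisons:
            p = 1.0 / (1.0 + np.exp(-(r[w] - r[l])))  # P(w beats l)
            grad[w] += 1.0 - p  # gradient ascent on the log-likelihood
            grad[l] -= 1.0 - p
        r += lr * grad
    return r - r.mean()  # rewards are identifiable only up to a shift

def gibbs_policy(pi_ref, r_target, r_prot, lam, beta):
    """Primal optimizer of the KL-regularized Lagrangian:
    pi(a) proportional to pi_ref(a) * exp((r_target(a) + lam * r_prot(a)) / beta)."""
    logits = np.log(pi_ref) + (r_target + lam * r_prot) / beta
    logits -= logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def solve_dual(pi_ref, r_target, r_prot, tau, beta, lam_max=100.0, tol=1e-6):
    """Dual-only search: pick the smallest lam >= 0 whose Gibbs policy
    satisfies the protected-group welfare constraint E_pi[r_prot] >= tau.
    The constraint value is nondecreasing in lam, so bisection suffices."""
    def welfare(lam):
        return gibbs_policy(pi_ref, r_target, r_prot, lam, beta) @ r_prot
    if welfare(0.0) >= tau:
        return 0.0  # constraint inactive at the optimum
    lo, hi = 0.0, lam_max  # if welfare(lam_max) < tau, constraint is infeasible here
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if welfare(mid) >= tau:
            hi = mid
        else:
            lo = mid
    return hi
```

Bisection is enough in this one-constraint sketch because E_pi[r_prot] under the Gibbs policy is monotone in lam (its derivative is a scaled variance); with the multiple constraints or general f-divergences mentioned in the abstract, the dual remains convex and a generic convex solver over the vector of multipliers would take its place.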