[2509.24207] Humanline: Online Alignment as Perceptual Loss
Computer Science > Artificial Intelligence
arXiv:2509.24207 (cs)
[Submitted on 29 Sep 2025 (v1), last revised 27 Mar 2026 (this version, v2)]

Title: Humanline: Online Alignment as Perceptual Loss
Authors: Sijia Liu, Niklas Muennighoff, Kawin Ethayarajh

Abstract: Online alignment (e.g., GRPO) is generally more performant than offline alignment (e.g., DPO) -- but why? Drawing on prospect theory from behavioral economics, we propose a human-centric explanation. We prove that online on-policy sampling better approximates the human-perceived distribution of what the model can produce, and that PPO/GRPO-style clipping -- originally introduced just to stabilize training -- recovers a perceptual bias in how humans perceive probability. In this sense, PPO/GRPO already act as perceptual losses. Our theory further suggests that the online/offline dichotomy is itself incidental to maximizing human utility: we can achieve the same effect by selectively training on any data in a manner that mimics human perception, rather than restricting ourselves to online on-policy data. Doing so would allow us to post-train more quickly, cheaply, and flexibly without sacrificing performance. To this end, we propose a design pattern that explicitly incorporates perceptual distortions of probability into objectives like DPO/KTO/GRPO, creating humanline variants of ...
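For context, the PPO/GRPO-style clipping the abstract refers to is the standard clipped surrogate objective, which bounds the policy-probability ratio before it multiplies the advantage. The sketch below is a minimal illustration of that standard mechanism, not the paper's humanline variant; the clip range `eps=0.2` is a conventional default, not a value taken from the paper.

```python
def ppo_clip_objective(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Per-token PPO clipped surrogate objective (to be maximized).

    ratio     -- pi_theta(a|s) / pi_old(a|s), the policy probability ratio
    advantage -- estimated advantage of the sampled action
    eps       -- clip range; ratios outside [1-eps, 1+eps] get no extra credit
    """
    unclipped = ratio * advantage
    # Clamp the ratio, then take the pessimistic (lower) of the two terms,
    # so large ratio deviations cannot inflate the objective.
    clipped = max(min(ratio, 1.0 + eps), 1.0 - eps) * advantage
    return min(unclipped, clipped)

# A ratio of 1.5 with positive advantage is capped at the 1.2 boundary:
print(ppo_clip_objective(1.5, 1.0))   # -> 1.2
# With negative advantage, the clipped term is the pessimistic one:
print(ppo_clip_objective(0.5, -1.0))  # -> -0.8
```

The paper's observation, as the abstract states it, is that this clamping of probability ratios (introduced for training stability) coincides with a prospect-theoretic distortion of perceived probability, which is what motivates building such distortions into DPO/KTO/GRPO-style objectives directly.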