[2603.03741] HALyPO: Heterogeneous-Agent Lyapunov Policy Optimization for Human-Robot Collaboration
Computer Science > Robotics
arXiv:2603.03741 (cs)
[Submitted on 4 Mar 2026]

Title: HALyPO: Heterogeneous-Agent Lyapunov Policy Optimization for Human-Robot Collaboration
Authors: Hao Zhang, Yaru Niu, Yikai Wang, Ding Zhao, H. Eric Tseng

Abstract: To improve generalization and resilience in human-robot collaboration (HRC), robots must handle the combinatorial diversity of human behaviors and contexts, motivating multi-agent reinforcement learning (MARL). However, inherent heterogeneity between robots and humans creates a rationality gap (RG) in the learning process: a variational mismatch between decentralized best-response dynamics and centralized cooperative ascent. The resulting learning problem is a general-sum differentiable game, so independent policy-gradient updates can oscillate or diverge without added structure. We propose heterogeneous-agent Lyapunov policy optimization (HALyPO), which establishes formal stability directly in the policy-parameter space by enforcing a per-step Lyapunov decrease condition on a parameter-space disagreement metric. Unlike Lyapunov-based safe RL, which targets state/trajectory constraints in constrained Markov decision processes, HALyPO uses Lyapunov certification to stabilize decentralized policy learning. HALyPO rectifies decentralized gradients via optimal ...
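The core mechanism described in the abstract, a per-step Lyapunov decrease condition on a parameter-space disagreement metric, can be sketched as follows. This is a minimal illustration, not the paper's method: the abstract is truncated before the actual rectification scheme, so the choice of disagreement metric V(θ) = ||θ_h − θ_r||², the symmetric correction rule, and the function name are all assumptions made here for concreteness.

```python
import numpy as np

def lyapunov_step(theta_h, theta_r, g_h, g_r, lr=0.1, alpha=0.1):
    """One joint update for a human-model policy (theta_h) and a robot
    policy (theta_r) that certifies a Lyapunov decrease on the
    parameter-space disagreement V = ||theta_h - theta_r||^2.
    Illustrative sketch only; not HALyPO's actual rectification."""
    d = theta_h - theta_r
    V = d @ d
    # Candidate independent (decentralized) policy-gradient updates.
    cand_h = theta_h - lr * g_h
    cand_r = theta_r - lr * g_r
    d_new = cand_h - cand_r
    # If the decrease condition already holds (or the policies agree
    # exactly, where strict decrease is impossible), accept the update.
    if d_new @ d_new < V or V == 0.0:
        return cand_h, cand_r
    # Otherwise rectify with a symmetric correction that forces the new
    # disagreement to (1 - alpha) * d, certifying
    # V_new = (1 - alpha)^2 * V < V.
    c = d_new - (1.0 - alpha) * d
    return cand_h - 0.5 * c, cand_r + 0.5 * c
```

With gradients that push the two policies apart, the raw updates would increase V, so the correction fires and contracts the disagreement instead, which is the kind of stabilizing intervention the abstract attributes to the Lyapunov certificate.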