[2603.05296] Latent Policy Steering through One-Step Flow Policies
Computer Science > Robotics

arXiv:2603.05296 (cs)

[Submitted on 5 Mar 2026]

Title: Latent Policy Steering through One-Step Flow Policies

Authors: Hokyun Im, Andrey Kolobov, Jianlong Fu, Youngwoon Lee

Abstract: Offline reinforcement learning (RL) allows robots to learn from offline datasets without risky exploration. Yet, offline RL's performance often hinges on a brittle trade-off between (1) return maximization, which can push policies outside the dataset support, and (2) behavioral constraints, which typically require sensitive hyperparameter tuning. Latent steering offers a structural way to stay within the dataset support during RL, but existing offline adaptations commonly approximate action values using latent-space critics learned via indirect distillation, which can lose information and hinder convergence. We propose Latent Policy Steering (LPS), which enables high-fidelity latent policy improvement by backpropagating original-action-space Q-gradients through a differentiable one-step MeanFlow policy to update a latent-action-space actor. By eliminating proxy latent critics, LPS allows an original-action-space critic to guide end-to-end latent-space optimization, while the one-step MeanFlow policy serves as a behavior-constrained generative prior. This decoupling yields a robust method that works out-of-the-box with minima...
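The core update the abstract describes — Q-gradients computed in the original action space and backpropagated through a frozen one-step decoder to train a latent-space actor — can be sketched as a toy example. Everything below (a linear latent actor, a linear stand-in for the one-step MeanFlow decoder, and a quadratic critic) is an illustrative assumption, not the paper's architecture; it only demonstrates the chain-rule pathway.

```python
import numpy as np

# Toy sketch of the LPS gradient path (illustrative, not the authors' code):
# latent actor z = W s  ->  frozen one-step decoder a = D z  ->  critic Q(s, a).
# The decoder stands in for the one-step MeanFlow policy; no proxy latent
# critic is used, only the original-action-space Q.

rng = np.random.default_rng(0)
s_dim, z_dim, a_dim = 4, 2, 3

s = rng.normal(size=s_dim)                    # a single state
W = 0.1 * rng.normal(size=(z_dim, s_dim))     # latent-space actor (trainable)
D = rng.normal(size=(a_dim, z_dim))           # frozen decoder, latent -> action
a_star = rng.normal(size=a_dim)               # critic optimum (toy)

def q_value(a):
    # Original-action-space critic: Q(s, a) = -||a - a*||^2 (toy, concave).
    return -np.sum((a - a_star) ** 2)

def q_grad_a(a):
    # dQ/da for the quadratic critic above.
    return -2.0 * (a - a_star)

lr = 1e-3
q_history = []
for _ in range(500):
    z = W @ s                                 # latent action
    a = D @ z                                 # decoded through frozen policy
    q_history.append(q_value(a))
    # Chain rule: dQ/dW = (D^T dQ/da) s^T -- the Q-gradient flows from the
    # original action space through the decoder into the latent actor.
    grad_W = np.outer(D.T @ q_grad_a(a), s)
    W += lr * grad_W                          # gradient ascent on Q

print(f"Q before: {q_history[0]:.3f}, Q after: {q_history[-1]:.3f}")
```

With the decoder frozen, ascent on Q only moves the latent actor, so the decoded actions stay in the decoder's range — a toy analogue of the behavior-constrained prior role the one-step policy plays in LPS.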