[2603.21621] Proximal Policy Optimization in Path Space: A Schrödinger Bridge Perspective

arXiv - Machine Learning 3 min read

arXiv:2603.21621 (cs)

[Submitted on 23 Mar 2026]

Title: Proximal Policy Optimization in Path Space: A Schrödinger Bridge Perspective

Authors: Yuehu Gong, Zeyuan Wang, Yulin Chen, Yanwei Fu

Abstract: On-policy reinforcement learning with generative policies is promising but remains underexplored. A central challenge is that proximal policy optimization (PPO) is traditionally formulated in terms of action-space probability ratios, whereas diffusion- and flow-based policies are more naturally represented as trajectory-level generative processes. In this work, we propose GSB-PPO, a path-space formulation of generative PPO inspired by the Generalized Schrödinger Bridge (GSB). Our framework lifts PPO-style proximal updates from terminal actions to full generation trajectories, yielding a unified view of on-policy optimization for generative policies. Within this framework, we develop two concrete objectives: a clipping-based objective, GSB-PPO-Clip, and a penalty-based objective, GSB-PPO-Penalty. Experimental results show that while both objectives are compatible with on-policy training, the penalty formulation consistently delivers better stability and performance than the clipping counterpart. Overall, our results highlight path-space proximal regularization as an effective...
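
To make the clipping-versus-penalty distinction in the abstract concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not give the GSB-PPO objectives, so this simply applies the standard PPO clipped and KL-penalized surrogates to a probability ratio taken over a whole generation trajectory rather than a single action. The function names, the per-step log-probability inputs, and the defaults `eps=0.2` and `beta=1.0` are all assumptions made for illustration.

```python
import math


def path_log_ratio(new_step_logps, old_step_logps):
    """Log of the path-level ratio p_new(path) / p_old(path).

    Assumes (hypothetically) that each policy's path log-probability
    factorizes into a sum of per-step log-probabilities along the
    generation trajectory.
    """
    return sum(n - o for n, o in zip(new_step_logps, old_step_logps))


def clip_objective(log_ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate, evaluated on a path-level ratio."""
    ratio = math.exp(log_ratio)
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


def penalty_objective(log_ratio, advantage, beta=1.0):
    """Standard PPO penalty surrogate, evaluated on a path-level ratio.

    The KL term is a single-sample, non-negative estimate of
    KL(old || new) for a path drawn from the old policy:
    (ratio - 1) - log(ratio).
    """
    ratio = math.exp(log_ratio)
    kl_estimate = (ratio - 1.0) - log_ratio
    return ratio * advantage - beta * kl_estimate


# Hypothetical per-step log-probabilities for one two-step generation path.
old_logps = [-1.5, -2.5]
new_logps = [-1.0, -2.0]
log_r = path_log_ratio(new_logps, old_logps)

print(clip_objective(log_r, advantage=2.0))
print(penalty_objective(log_r, advantage=2.0))
```

In this sketch the clipped form stops rewarding the update once the path ratio leaves the interval [1 - eps, 1 + eps], while the penalty form keeps a smooth cost that grows as the new path distribution drifts from the old one. Because the path ratio multiplies per-step ratios, it can move far from 1 even when each step changes only slightly.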

Originally published on March 24, 2026. Curated by AI News.
