[2511.22235] Training High-Level Schedulers with Execution-Feedback Reinforcement Learning for Long-Horizon GUI Automation
arXiv:2511.22235 (cs) — Computer Science > Artificial Intelligence
[Submitted on 27 Nov 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Authors: Zehao Deng, Tianjie Ju, Zheng Wu, Zhuosheng Zhang, Gongshen Liu

Abstract: The rapid development of large vision-language models (VLMs) has greatly advanced research on GUI agents. However, GUI agents still face significant challenges in handling long-horizon tasks. First, single-agent models struggle to balance high-level planning and low-level execution capabilities, facing prevalent issues of responsibility coupling and capability conflicts. Second, agents lack awareness of the task state, leading to progress loss in long-horizon tasks. To address these challenges, we propose a staged execution-feedback reinforcement learning algorithm. Rather than training a unified policy model, we focus on training high-level scheduling models. Specifically, we propose and train two agents: a Coordinator, responsible for strategic planning and task decomposition, and a State Tracker, responsible for context compression and information management to maintain the task's state and coherence. Based on this, we built the Coordi...
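The abstract's division of labor — a Coordinator that decomposes and schedules subtasks, and a State Tracker that compresses execution history into a compact task state — can be sketched as a simple control loop. This is a minimal illustrative sketch, not the authors' implementation: the class names mirror the abstract's terminology, but the plan representation, the `low_level_execute` stub, and the summary format are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class StateTracker:
    """Maintains a compressed record of execution so far (task-state awareness)."""
    completed: list = field(default_factory=list)

    def update(self, subtask: str, success: bool) -> None:
        # Fold each execution-feedback signal into the tracked state.
        self.completed.append((subtask, success))

    def summary(self) -> str:
        # Compressed context handed back to the high-level scheduler.
        done = [s for s, ok in self.completed if ok]
        return f"{len(done)}/{len(self.completed)} subtasks succeeded: {done}"

@dataclass
class Coordinator:
    """Decomposes a long-horizon task and schedules the next subtask."""
    plan: list  # assumed: a fixed ordered decomposition of the task

    def next_subtask(self, tracker: StateTracker):
        done = {s for s, ok in tracker.completed if ok}
        for sub in self.plan:
            if sub not in done:
                return sub
        return None  # all subtasks complete

def low_level_execute(subtask: str) -> bool:
    """Stub for a low-level GUI executor; always reports success here."""
    return True

def run(task_plan: list) -> str:
    coordinator = Coordinator(plan=task_plan)
    tracker = StateTracker()
    # Scheduling loop: Coordinator picks a subtask, the executor runs it,
    # and the State Tracker absorbs the feedback before the next decision.
    while (sub := coordinator.next_subtask(tracker)) is not None:
        success = low_level_execute(sub)
        tracker.update(sub, success)
    return tracker.summary()

print(run(["open settings", "enable wifi", "close settings"]))
```

In the paper's setting the Coordinator and State Tracker are trained VLM policies optimized with execution-feedback reinforcement learning; here both are deterministic stand-ins so the loop structure is visible in isolation.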