[2602.17049] IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents
Summary
The paper presents IntentCUA, a multi-agent framework for computer-use agents in which a Planner, Plan-Optimizer, and Critic coordinate over a shared memory of intent-level representations and reusable skills to stabilize long-horizon execution.
Why It Matters
As automation in desktop environments becomes increasingly complex, understanding user intent and improving execution stability is crucial. IntentCUA addresses these challenges by providing a structured approach to skill abstraction and multi-agent coordination, which can significantly enhance the reliability of automated systems.
Key Takeaways
- IntentCUA achieved a 74.83% task success rate with a Step Efficiency Ratio of 0.91 in end-to-end evaluations.
- The framework utilizes intent-aligned plan memory to enhance execution stability.
- Multi-view intent abstraction and shared memory are critical for reducing error propagation.
- Cooperative multi-agent coordination leads to significant gains in long-horizon tasks.
- The approach outperforms traditional RL-based and trajectory-centric methods.
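The retrieval step behind these takeaways (intent prototypes retrieving skills that are injected into partial plans) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Skill` class, the cosine-similarity matching, the `0.8` threshold, the `<SKILL>` placeholder, and the toy three-dimensional embeddings are all assumptions made for illustration only.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Skill:
    """Hypothetical memory entry: an intent prototype plus reusable steps."""
    name: str
    prototype: list[float]  # assumed intent-prototype embedding
    steps: list[str]        # assumed reusable action sequence


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_skill(intent: list[float], memory: list[Skill],
                   threshold: float = 0.8) -> Skill | None:
    """Return the most similar skill, or None if nothing clears the threshold."""
    best = max(memory, key=lambda s: cosine(intent, s.prototype), default=None)
    if best is None or cosine(intent, best.prototype) < threshold:
        return None
    return best


def inject(partial_plan: list[str], skill: Skill,
           slot: str = "<SKILL>") -> list[str]:
    """Expand each placeholder step in the partial plan into the skill's steps."""
    plan: list[str] = []
    for step in partial_plan:
        if step == slot:
            plan.extend(skill.steps)
        else:
            plan.append(step)
    return plan


# Toy shared memory with made-up prototypes (assumption, not from the paper).
MEMORY = [
    Skill("save_as_pdf", [1.0, 0.0, 0.0],
          ["open File menu", "click Export", "choose PDF", "confirm"]),
    Skill("send_email", [0.0, 1.0, 0.0],
          ["open mail client", "compose", "attach file", "send"]),
]


def complete_plan(intent: list[float], partial_plan: list[str],
                  memory: list[Skill] = MEMORY) -> list[str]:
    """Fill the plan from memory when a skill matches; otherwise leave it as is."""
    skill = retrieve_skill(intent, memory)
    return inject(partial_plan, skill) if skill else partial_plan
```

In this sketch, a plan whose intent matches a stored prototype is completed from memory instead of being re-planned step by step. An unmatched intent leaves the placeholder in place, which a planner would then have to resolve by other means.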
Computer Science > Artificial Intelligence
arXiv:2602.17049 (cs)
[Submitted on 19 Feb 2026]
Title: IntentCUA: Learning Intent-level Representations for Skill Abstraction and Multi-Agent Planning in Computer-Use Agents
Authors: Seoyoung Lee, Seobin Yoon, Seongbeen Lee, Yoojung Chun, Dayoung Park, Doyeon Kim, Joo Yong Sim
Abstract: Computer-use agents operate over long horizons under noisy perception, multi-window contexts, evolving environment states. Existing approaches, from RL-based planners to trajectory retrieval, often drift from user intent and repeatedly solve routine subproblems, leading to error accumulation and inefficiency. We present IntentCUA, a multi-agent computer-use framework designed to stabilize long-horizon execution through intent-aligned plan memory. A Planner, Plan-Optimizer, and Critic coordinate over shared memory that abstracts raw interaction traces into multi-view intent representations and reusable skills. At runtime, intent prototypes retrieve subgroup-aligned skills and inject them into partial plans, reducing redundant re-planning and mitigating error propagation across desktop applications. In end-to-end evaluations, IntentCUA achieved a 74.83% task success rate with a Step Efficiency Ratio of 0.91, outperforming RL-based and traje...