[2602.16855] Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents


arXiv - AI · 4 min read

Summary

The paper presents Mobile-Agent-v3.5, a family of multi-platform GUI agent models built on GUI-Owl-1.5, and demonstrates state-of-the-art performance in GUI automation, grounding, tool calling, and real-time interaction across desktop, mobile, and browser environments.

Why It Matters

This research is significant as it addresses the growing need for efficient and versatile GUI agents that can operate seamlessly across multiple platforms. The innovations introduced in the model, particularly in data collection and reasoning capabilities, could enhance user experience and automation in software applications.

Key Takeaways

  • Mobile-Agent-v3.5 introduces GUI-Owl-1.5, a state-of-the-art native GUI agent model available in multiple sizes (2B to 235B).
  • The model supports desktop, mobile, and browser platforms, enabling cloud-edge collaboration and real-time interaction.
  • A Hybrid Data Flywheel improves data-collection efficiency by combining simulated environments with cloud-based sandboxes.
  • A new reinforcement learning algorithm, MRPO, addresses multi-platform training challenges.
  • The models are open-sourced, promoting accessibility and further research.

Computer Science > Artificial Intelligence
arXiv:2602.16855 (cs) · Submitted on 15 Feb 2026

Title: Mobile-Agent-v3.5: Multi-platform Fundamental GUI Agents

Authors: Haiyang Xu, Xi Zhang, Haowei Liu, Junyang Wang, Zhaozai Zhu, Shengjie Zhou, Xuhao Hu, Feiyu Gao, Junjie Cao, Zihua Wang, Zhiyuan Chen, Jitong Liao, Qi Zheng, Jiahui Zeng, Ze Xu, Shuai Bai, Junyang Lin, Jingren Zhou, Ming Yan

Abstract: The paper introduces GUI-Owl-1.5, the latest native GUI agent model, which features instruct/thinking variants in multiple sizes (2B/4B/8B/32B/235B) and supports a range of platforms (desktop, mobile, browser, and more) to enable cloud-edge collaboration and real-time interaction. GUI-Owl-1.5 achieves state-of-the-art results among open-source models on more than 20 GUI benchmarks: (1) on GUI automation tasks, it obtains 56.5 on OSWorld, 71.6 on AndroidWorld, and 48.4 on WebArena; (2) on grounding tasks, it obtains 80.3 on ScreenSpotPro; (3) on tool-calling tasks, it obtains 47.6 on OSWorld-MCP and 46.8 on MobileWorld; (4) on memory and knowledge tasks, it obtains 75.5 on GUI-Knowledge Bench. GUI-Owl-1.5 incorporates several key innovations: (1) Hybrid Data Flywheel: a data pipeline for UI understanding and trajectory generation built on a combination of simulated environments and cloud-based sandbox environments, in orde...

