[2603.20351] MANA: Towards Efficient Mobile Ad Detection via Multimodal Agentic UI Navigation
Computer Science > Cryptography and Security
arXiv:2603.20351 (cs)
[Submitted on 20 Mar 2026]
Title: MANA: Towards Efficient Mobile Ad Detection via Multimodal Agentic UI Navigation
Authors: Yizhe Zhao, Yongjian Fu, Zihao Feng, Hao Pan, Yongheng Deng, Yaoxue Zhang, Ju Ren
Abstract: Mobile advertising dominates app monetization but introduces risks ranging from intrusive user experience to malware delivery. Existing detection methods rely either on static analysis, which misses runtime behaviors, or on heuristic UI exploration, which struggles with sparse and obfuscated ads. In this paper, we present MANA, the first agentic multimodal reasoning framework for mobile ad detection. MANA integrates static, visual, temporal, and experiential signals into a reasoning-guided navigation strategy that determines not only how to traverse interfaces but also where to focus, enabling efficient and robust exploration. We implement and evaluate MANA on commercial smartphones over 200 apps, achieving state-of-the-art accuracy and efficiency. Compared to baselines, it improves detection accuracy by 30.5%-56.3% and reduces exploration steps by 29.7%-63.3%. Case studies further demonstrate its ability to uncover obfuscated and malicious ads, underscoring its practicality for mobile ad auditing and its potential for broade...