[2601.04786] AgentOCR: Reimagining Agent History via Optical Self-Compression
Computer Science > Machine Learning

arXiv:2601.04786 (cs)

[Submitted on 8 Jan 2026 (v1), last revised 28 Feb 2026 (this version, v2)]

Title: AgentOCR: Reimagining Agent History via Optical Self-Compression

Authors: Lang Feng, Fuchao Yang, Feng Chen, Xin Cheng, Haiyang Xu, Zhenglin Wan, Ming Yan, Bo An

Abstract: Recent advances in large language models (LLMs) enable agentic systems trained with reinforcement learning (RL) over multi-turn interaction trajectories, but practical deployment is bottlenecked by rapidly growing textual histories that inflate token budgets and memory usage. We introduce AgentOCR, a framework that exploits the superior information density of visual tokens by representing the accumulated observation-action history as a compact rendered image. To make multi-turn rollouts scalable, AgentOCR proposes segment optical caching. By decomposing history into hashable segments and maintaining a visual cache, this mechanism eliminates redundant re-rendering. Beyond fixed rendering, AgentOCR introduces agentic self-compression, where the agent actively emits a compression rate and is trained with compression-aware reward to adaptively balance task success and token efficiency. We conduct extensive experiments on challenging agentic benchmarks, ALFWorld and search-based QA. Remarkably, results demonstrate tha...
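
The segment caching idea described in the abstract (hashable history segments plus a cache of rendered output) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `SegmentRenderCache`, the helper `segment_key`, the choice of SHA-256 as the hash, and the placeholder `fake_render` (which stands in for a real text-to-image renderer) are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Dict, List


def segment_key(segment: str) -> str:
    """Content hash used as the cache key for one history segment (assumed scheme)."""
    return hashlib.sha256(segment.encode("utf-8")).hexdigest()


class SegmentRenderCache:
    """Hypothetical cache: each distinct segment is rendered at most once."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render
        self._cache: Dict[str, bytes] = {}
        self.render_calls = 0  # counts cache misses, i.e. actual renders

    def get(self, segment: str) -> bytes:
        key = segment_key(segment)
        if key not in self._cache:
            self.render_calls += 1
            self._cache[key] = self._render(segment)
        return self._cache[key]

    def render_history(self, segments: List[str]) -> List[bytes]:
        # Only segments not seen before trigger a render.
        return [self.get(s) for s in segments]


def fake_render(segment: str) -> bytes:
    # Placeholder for a real text-to-image renderer (hypothetical).
    return b"IMG:" + segment.encode("utf-8")


cache = SegmentRenderCache(fake_render)
history = ["obs: you are in a kitchen", "act: open fridge"]
cache.render_history(history)            # renders both segments
history.append("obs: the fridge is open")
tiles = cache.render_history(history)    # renders only the new segment
```

In this sketch the second call re-renders nothing that was already cached, so the number of renders grows with the number of new segments rather than with the full history length at each turn.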