[2603.01455] From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents

arXiv - AI 4 min read

Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.01455 (cs)
Submitted on 2 Mar 2026

Title: From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents

Authors: Niu Lian, Yuting Wang, Hanshu Yao, Jinpeng Wang, Bin Chen, Yaowei Wang, Min Zhang, Shu-Tao Xia

Abstract: While multimodal large language models have demonstrated impressive short-term reasoning, they struggle with long-horizon video understanding due to limited context windows and static memory mechanisms that fail to mirror human cognitive efficiency. Existing paradigms typically fall into two extremes: vision-centric methods that incur high latency and redundancy through dense visual accumulation, or text-centric approaches that suffer from detail loss and hallucination via aggressive captioning. To bridge this gap, we propose MM-Mem, a pyramidal multimodal memory architecture grounded in Fuzzy-Trace Theory. MM-Mem structures memory hierarchically into a Sensory Buffer, Episodic Stream, and Symbolic Schema, enabling the progressive distillation of fine-grained perceptual traces (verbatim) into high-level semantic schemas (gist). Furthermore, to govern the dynamic construction of memory, we derive a Semantic Inf...
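
To make the three-tier idea concrete, here is a minimal Python sketch of a memory that distills verbatim traces upward into gist. Only the tier names (Sensory Buffer, Episodic Stream, Symbolic Schema) come from the abstract. The class name, the fixed-capacity promotion rule, the `summarize` callback, and all parameters are hypothetical stand-ins for illustration; they are not MM-Mem's actual mechanism, which the truncated abstract does not fully describe.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PyramidalMemory:
    """Toy three-tier memory. Tier names follow the abstract; the rest is assumed."""
    summarize: Callable[[List[str]], str]   # hypothetical distillation function
    buffer_size: int = 4                    # hypothetical Sensory Buffer capacity
    episodes_per_schema: int = 2            # hypothetical fan-in per schema entry
    sensory_buffer: List[str] = field(default_factory=list)
    episodic_stream: List[str] = field(default_factory=list)
    symbolic_schema: List[str] = field(default_factory=list)
    _distilled: int = 0                     # episodes already folded into a schema

    def observe(self, trace: str) -> None:
        """Add one verbatim trace and promote upward when a tier fills."""
        self.sensory_buffer.append(trace)
        if len(self.sensory_buffer) < self.buffer_size:
            return
        # Buffer is full: compress it into one episode and reset the buffer.
        self.episodic_stream.append(self.summarize(self.sensory_buffer))
        self.sensory_buffer.clear()
        # Enough undistilled episodes: compress them into one schema entry.
        pending = self.episodic_stream[self._distilled:]
        if len(pending) == self.episodes_per_schema:
            self.symbolic_schema.append(self.summarize(pending))
            self._distilled = len(self.episodic_stream)


def join_summary(items: List[str]) -> str:
    """Placeholder summarizer; a real system would call a model here."""
    return " | ".join(items)


mem = PyramidalMemory(summarize=join_summary, buffer_size=2, episodes_per_schema=2)
for i in range(5):
    mem.observe(f"frame{i}")
```

In this sketch the episodes are kept after they are folded into a schema entry, so a query could still fall back from gist to finer detail. That retention policy is also an assumption made for the example.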

Originally published on March 03, 2026. Curated by AI News.
