Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.01455 (cs)

[Submitted on 2 Mar 2026]

Title: From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents

Authors: Niu Lian, Yuting Wang, Hanshu Yao, Jinpeng Wang, Bin Chen, Yaowei Wang, Min Zhang, Shu-Tao Xia

Abstract: While multimodal large language models have demonstrated impressive short-term reasoning, they struggle with long-horizon video understanding due to limited context windows and static memory mechanisms that fail to mirror human cognitive efficiency. Existing paradigms typically fall into two extremes: vision-centric methods that incur high latency and redundancy through dense visual accumulation, and text-centric approaches that suffer from detail loss and hallucination via aggressive captioning. To bridge this gap, we propose MM-Mem, a pyramidal multimodal memory architecture grounded in Fuzzy-Trace Theory. MM-Mem structures memory hierarchically into a Sensory Buffer, an Episodic Stream, and a Symbolic Schema, enabling the progressive distillation of fine-grained perceptual traces (verbatim) into high-level semantic schemas (gist). Furthermore, to govern the dynamic construction of memory, we derive a Semantic Information Bottleneck ...
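
To make the three-tier design concrete, below is a minimal, hypothetical Python sketch of the Sensory Buffer -> Episodic Stream -> Symbolic Schema hierarchy named in the abstract. All class names, parameters, and the fixed salience threshold are illustrative assumptions, not the paper's interface; a real Semantic Information Bottleneck would presumably score each trace by trading task-relevant information against storage cost, in the spirit of the classical objective min I(X;Z) - beta * I(Z;Y), whereas this sketch stands in a precomputed salience value.

# Hypothetical sketch of the pyramidal memory; names and the threshold
# gate are assumptions for illustration, not the paper's implementation.
from collections import deque
from dataclasses import dataclass


@dataclass
class Trace:
    timestamp: float
    content: str     # e.g. a frame caption or region description
    salience: float  # stand-in for a Semantic Information Bottleneck score


class PyramidalMemory:
    """Verbatim-to-gist distillation over three bounded tiers."""

    def __init__(self, sensory_cap=64, episodic_cap=256, promote_threshold=0.5):
        self.sensory = deque(maxlen=sensory_cap)    # fine-grained perceptual traces
        self.episodic = deque(maxlen=episodic_cap)  # mid-level event summaries
        self.schema = []                            # high-level semantic gist
        self.promote_threshold = promote_threshold

    def observe(self, trace: Trace) -> None:
        """Ingest a raw perceptual trace into the sensory buffer."""
        self.sensory.append(trace)

    def distill(self) -> None:
        """Promote traces that clear the salience gate; drop the rest."""
        while self.sensory:
            t = self.sensory.popleft()
            if t.salience >= self.promote_threshold:
                self.episodic.append(t)
        # Once the episodic stream fills, compress it into a schema entry
        # (here a naive concatenation; the paper's gist extraction differs).
        if len(self.episodic) == self.episodic.maxlen:
            gist = " / ".join(t.content for t in list(self.episodic)[:3])
            self.schema.append(f"gist: {gist} ...")
            self.episodic.clear()


mem = PyramidalMemory()
mem.observe(Trace(0.0, "person enters kitchen", salience=0.9))
mem.observe(Trace(0.5, "static background", salience=0.1))
mem.distill()  # keeps the salient trace, discards the redundant one

The bounded deques mirror the abstract's point that memory must stay compact under long horizons: only traces worth their storage cost survive promotion, and the raw verbatim detail is discarded once its gist is captured.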