[2602.12641] Artic: AI-oriented Real-time Communication for MLLM Video Assistant

Summary

The paper presents Artic, an AI-oriented real-time communication (RTC) framework for Multimodal Large Language Model (MLLM) video assistants. It targets the latency spikes and accuracy drops that the authors measured when running current RTC frameworks on their production prototype.

Why It Matters

As AI video assistants become more prevalent, real-time communication tuned for a model that must understand the video, rather than a human who watches it, becomes central to user experience. Artic redesigns bitrate control and streaming around that shift, which makes it relevant to developers and researchers in AI and networking.

Key Takeaways

  • Artic improves real-time communication for MLLM video assistants.
  • Introduces Response Capability-aware Adaptive Bitrate, which caps bitrate once MLLM accuracy saturates and reserves the remaining bandwidth as headroom against fluctuations.
  • Features Zero-overhead Context-aware Streaming to prioritize important video regions.
  • Establishes a Degraded Video Understanding Benchmark for evaluating MLLM accuracy.
  • Prototype tests show significant improvements in accuracy and latency.

Computer Science > Networking and Internet Architecture

arXiv:2602.12641 (cs) [Submitted on 13 Feb 2026]

Title: Artic: AI-oriented Real-time Communication for MLLM Video Assistant

Authors: Jiangkai Wu, Zhiyuan Ren, Junquan Zhong, Liming Liu, Xinggong Zhang

Abstract: AI Video Assistant emerges as a new paradigm for Real-time Communication (RTC), where one peer is a Multimodal Large Language Model (MLLM) deployed in the cloud. This makes interaction between humans and AI more intuitive, akin to chatting with a real person. However, a fundamental mismatch exists between current RTC frameworks and AI Video Assistants, stemming from the drastic shift in Quality of Experience (QoE) and more challenging networks. Measurements on our production prototype also confirm that current RTC fails, causing latency spikes and accuracy drops. To address these challenges, we propose Artic, an AI-oriented RTC framework for MLLM Video Assistants, exploring the shift from "humans watching video" to "AI understanding video." Specifically, Artic proposes: (1) Response Capability-aware Adaptive Bitrate, which utilizes MLLM accuracy saturation to proactively cap bitrate, reserving bandwidth headroom to absorb future fluctuations for latency reduction; (2) Zero-overhead Context-aware Streaming, which allocates limited bitrate to regions most...
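The bitrate-capping idea in point (1) of the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the `tolerance` parameter, and the accuracy-versus-bitrate numbers below are all hypothetical, chosen only to show how a sender might stop raising bitrate once measured MLLM accuracy has flattened out and leave the rest of the estimated bandwidth as headroom.

```python
def saturation_bitrate(accuracy_by_bitrate, tolerance=0.01):
    """Return the lowest bitrate whose accuracy is within `tolerance`
    of the best accuracy in the profile (the assumed saturation point)."""
    best = max(accuracy_by_bitrate.values())
    for bitrate in sorted(accuracy_by_bitrate):
        if accuracy_by_bitrate[bitrate] >= best - tolerance:
            return bitrate


def capped_target_bitrate(estimated_bandwidth_kbps, saturation_kbps):
    """Never send above the saturation point, even when the network
    estimate allows it; the unused bandwidth acts as headroom."""
    return min(estimated_bandwidth_kbps, saturation_kbps)


# Hypothetical profile: bitrate in kbps -> MLLM answer accuracy.
# These values are invented for illustration, not taken from the paper.
profile = {200: 0.60, 400: 0.78, 800: 0.85, 1600: 0.855, 3200: 0.857}

cap = saturation_bitrate(profile)
good_network = capped_target_bitrate(2500, cap)  # bandwidth exceeds the cap
poor_network = capped_target_bitrate(500, cap)   # bandwidth is the limit
```

With this made-up profile, accuracy gains above 800 kbps are smaller than the tolerance, so the sender would hold at 800 kbps on a 2500 kbps link and fall back to the bandwidth estimate only when the link drops below the cap.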
