[2602.13318] DECKBench: Benchmarking Multi-Agent Frameworks for Academic Slide Generation and Editing

arXiv - Machine Learning · 4 min read

Summary

DECKBench introduces a new evaluation framework for multi-agent systems focused on generating and editing academic slide decks, addressing gaps in existing benchmarks.

Why It Matters

This work matters because it provides a standardized way to evaluate multi-agent frameworks on academic slide generation and editing, where the fidelity and coherence of the resulting presentations are crucial for effective research communication.

Key Takeaways

  • DECKBench offers a comprehensive evaluation framework for academic slide generation and editing.
  • The framework assesses fidelity, coherence, layout quality, and multi-turn instruction following (see the sketch after this list).
  • It includes a modular multi-agent baseline that decomposes slide creation into paper parsing and summarization, slide planning, HTML creation, and iterative editing.
  • The benchmark highlights both strengths and weaknesses in existing systems.
  • Publicly available code and data promote reproducibility and further research.
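To make these evaluation axes concrete, here is a minimal sketch of how a DECKBench-style harness might organize per-slide and per-deck scores. The paper does not publish this interface; the class names SlideScore and DeckEvaluation, the score fields, and the mean-based aggregation are illustrative assumptions, not the benchmark's actual API.

```python
from dataclasses import dataclass, field
from statistics import mean

# Hypothetical score containers; field names mirror the dimensions
# DECKBench reports (fidelity, coherence, layout quality, and
# multi-turn instruction following), not the paper's real code.
@dataclass
class SlideScore:
    fidelity: float        # faithfulness of content to the source paper
    coherence: float       # logical flow within and across slides
    layout_quality: float  # layout-aware rendering quality

@dataclass
class DeckEvaluation:
    slide_scores: list[SlideScore] = field(default_factory=list)
    instruction_following: float = 0.0  # deck-level, over multi-turn edits

    def deck_level(self) -> dict[str, float]:
        """Aggregate slide-level scores into deck-level metrics."""
        return {
            "fidelity": mean(s.fidelity for s in self.slide_scores),
            "coherence": mean(s.coherence for s in self.slide_scores),
            "layout_quality": mean(s.layout_quality for s in self.slide_scores),
            "instruction_following": self.instruction_following,
        }

# Example: score a three-slide deck after a round of editing turns.
ev = DeckEvaluation(
    slide_scores=[
        SlideScore(0.9, 0.8, 0.7),
        SlideScore(0.8, 0.9, 0.8),
        SlideScore(0.7, 0.7, 0.9),
    ],
    instruction_following=0.85,
)
print(ev.deck_level())
```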

Computer Science > Artificial Intelligence

arXiv:2602.13318 (cs) [Submitted on 10 Feb 2026]

Title: DECKBench: Benchmarking Multi-Agent Frameworks for Academic Slide Generation and Editing
Authors: Daesik Jang, Morgan Lindsay Heisler, Linzi Xing, Yifei Li, Edward Wang, Ying Xiong, Yong Zhang, Zhenan Fan

Abstract: Automatically generating and iteratively editing academic slide decks requires more than document summarization. It demands faithful content selection, coherent slide organization, layout-aware rendering, and robust multi-turn instruction following. However, existing benchmarks and evaluation protocols do not adequately measure these challenges. To address this gap, we introduce the Deck Edits and Compliance Kit Benchmark (DECKBench), an evaluation framework for multi-agent slide generation and editing. DECKBench is built on a curated dataset of paper-to-slide pairs augmented with realistic, simulated editing instructions. Our evaluation protocol systematically assesses slide-level and deck-level fidelity, coherence, layout quality, and multi-turn instruction following. We further implement a modular multi-agent baseline system that decomposes the slide generation and editing task into paper parsing and summarization, slide planning, HTML creation, and iterative editing. Experimental re...
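The baseline's decomposition maps naturally onto a staged pipeline. The sketch below illustrates that four-stage flow under the assumption of LLM-backed stages behind a generic call_llm placeholder; none of the function names, prompts, or the slide-break convention come from the paper or its released code.

```python
# Illustrative four-stage pipeline mirroring the baseline's decomposition:
# paper parsing/summarization -> slide planning -> HTML creation -> editing.
# `call_llm` is a stand-in for whatever model backend the system uses.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a hosted or local model client)."""
    raise NotImplementedError

def parse_and_summarize(paper_text: str) -> str:
    """Stage 1: extract and condense the paper's key content."""
    return call_llm(f"Summarize the key contributions and results:\n{paper_text}")

def plan_slides(summary: str) -> list[str]:
    """Stage 2: turn the summary into an ordered list of slide outlines."""
    plan = call_llm(f"Propose a slide-by-slide outline:\n{summary}")
    return [line for line in plan.splitlines() if line.strip()]

def render_html(outline: str) -> str:
    """Stage 3: render one slide outline as a self-contained HTML slide."""
    return call_llm(f"Render this slide outline as HTML:\n{outline}")

def apply_edit(slides: list[str], instruction: str) -> list[str]:
    """Stage 4: apply one editing instruction to the whole deck."""
    deck = "\n<!-- slide break -->\n".join(slides)
    edited = call_llm(f"Apply this edit to the deck:\n{instruction}\n\n{deck}")
    return edited.split("\n<!-- slide break -->\n")

def generate_deck(paper_text: str, edit_instructions: list[str]) -> list[str]:
    """Run the full pipeline, then apply edits turn by turn."""
    summary = parse_and_summarize(paper_text)
    slides = [render_html(outline) for outline in plan_slides(summary)]
    for instruction in edit_instructions:  # iterative, multi-turn editing
        slides = apply_edit(slides, instruction)
    return slides
```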

Related Articles

Open Source AI

we just hit 555 stars on our open source AI agent config tool and i'm honestly still in shock

so a while back me and a few folks started working on Caliber, an open source tool for managing AI agent configs and syncing them with yo...

Reddit - Artificial Intelligence · 1 min ·
Robotics

[P] Cadenza: Connect Wandb logs to agents easily for autonomous research.

Wandb CLI and MCP are atrocious to use with agents for full autonomous research loops. They are slow, clunky, and result in context rot. S...

Reddit - Artificial Intelligence · 1 min ·
LLMs

Nvidia goes all-in on AI agents while Anthropic pulls the plug

TLDR: Nvidia is partnering with 17 major companies to build a platform specifically for enterprise AI agents, basically trying to become ...

Reddit - Artificial Intelligence · 1 min ·