[2602.16072] Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research
Summary
Omni-iEEG is a large-scale, pre-surgical iEEG resource for epilepsy research, comprising 302 patients and 178 hours of high-resolution recordings. It is intended to support data-driven localization of the epileptogenic zone (EZ).
Why It Matters
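Existing data-driven approaches are typically developed on single-center datasets that are inconsistent in format and metadata, lack standardized benchmarks, and rarely release pathological event annotations. By reconciling publicly available sources into one standardized resource, Omni-iEEG aims to improve reproducibility and cross-center validation, and to support machine learning models that reduce the labor-intensive manual review in current clinical workflows.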
Key Takeaways
- Omni-iEEG includes 302 patients and 178 hours of high-resolution iEEG recordings.
- The dataset harmonizes clinical metadata and provides over 36K expert-validated annotations.
- It establishes a standardized benchmark for evaluating machine learning models in epilepsy research.
- The resource enables systematic evaluation of clinically relevant tasks grounded in clinical priors.
- Omni-iEEG enhances the transferability of models pretrained on non-neurophysiological domains.
Computer Science > Machine Learning
arXiv:2602.16072 (cs)
[Submitted on 17 Feb 2026]
Title: Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research
Authors: Chenda Duan, Yipeng Zhang, Sotaro Kanai, Yuanyi Ding, Atsuro Daida, Pengyue Yu, Tiancheng Zheng, Naoto Kuroda, Shaun A. Hussain, Eishi Asano, Hiroki Nariai, Vwani Roychowdhury
Abstract: Epilepsy affects over 50 million people worldwide, and one-third of patients suffer drug-resistant seizures where surgery offers the best chance of seizure freedom. Accurate localization of the epileptogenic zone (EZ) relies on intracranial EEG (iEEG). Clinical workflows, however, remain constrained by labor-intensive manual review. At the same time, existing data-driven approaches are typically developed on single-center datasets that are inconsistent in format and metadata, lack standardized benchmarks, and rarely release pathological event annotations, creating barriers to reproducibility, cross-center validation, and clinical relevance. With extensive efforts to reconcile heterogeneous iEEG formats, metadata, and recordings across publicly available sources, we present Omni-iEEG, a large-scale, pre-surgical iEEG resource comprising 302 patients and 178 hours of high-resolution recordings....