[2603.01270] VoxKnesset: A Large-Scale Longitudinal Hebrew Speech Dataset for Aging Speaker Modeling

arXiv - Machine Learning March 03, 2026 3 min read

About this article

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2603.01270 (eess)

[Submitted on 1 Mar 2026]

Title: VoxKnesset: A Large-Scale Longitudinal Hebrew Speech Dataset for Aging Speaker Modeling

Authors: Yanir Marmor, Arad Zulti, David Krongauz, Adam Gabet, Yoad Snapir, Yair Lifshitz, Eran Segal

Abstract: Speech processing systems face a fundamental challenge: the human voice changes with age, yet few datasets support rigorous longitudinal evaluation. We introduce VoxKnesset, an open-access dataset of ~2,300 hours of Hebrew parliamentary speech spanning 2009-2025, comprising 393 speakers with recording spans of up to 15 years. Each segment includes aligned transcripts and verified demographic metadata from official parliamentary records. We benchmark modern speech embeddings (WavLM-Large, ECAPA-TDNN, Wav2Vec2-XLSR-1B) on age prediction and speaker verification under longitudinal conditions. Speaker verification EER rises from 2.15% to 4.58% over 15 years for the strongest model, and cross-sectionally trained age regressors fail to capture within-speaker aging, while longitudinally trained models recover a meaningful temporal signal. We publicly release the dataset and pipeline to support aging-robust speech systems and Hebrew speech processing.

Subjects: Audio...
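For readers unfamiliar with the equal error rate (EER) quoted in the abstract, the following is a minimal Python sketch of how the metric is commonly approximated: sweep a decision threshold over trial scores and report the point where the false-accept and false-reject rates meet. The function name equal_error_rate and the toy scores are hypothetical and are not taken from the VoxKnesset pipeline, and the paper may compute EER differently. The last line works out the relative increase implied by the two reported figures.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold.

    genuine:  similarity scores for same-speaker trials
    impostor: similarity scores for different-speaker trials
    A trial is accepted when its score is >= the threshold.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false-accept rate
        frr = sum(s < t for s in genuine) / len(genuine)     # false-reject rate
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores invented for illustration (assumption: not VoxKnesset data).
genuine = [0.9, 0.8, 0.7, 0.6, 0.3]
impostor = [0.5, 0.4, 0.2, 0.1, 0.05]
toy_eer = equal_error_rate(genuine, impostor)

# Relative change between the two EER values reported in the abstract.
relative_increase = (4.58 - 2.15) / 2.15
print(f"toy EER = {toy_eer:.2f}, relative increase = {relative_increase:.0%}")
```

On the reported numbers, the error rate roughly doubles over the 15-year span (an increase of about 113% relative to the 2.15% baseline), even though the absolute change is under 2.5 percentage points.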

Originally published on March 03, 2026. Curated by AI News.
