[2603.29522] Baby Scale: Investigating Models Trained on Individual Children's Language Input
Computer Science > Computation and Language

arXiv:2603.29522 (cs)

[Submitted on 31 Mar 2026]

Title: Baby Scale: Investigating Models Trained on Individual Children's Language Input

Authors: Steven Y. Feng, Alvin W.M. Tan, Michael C. Frank

Abstract: Modern language models (LMs) must be trained on orders of magnitude more words than human children receive before they begin to produce useful behavior. Assessing the nature and origins of this "data gap" requires benchmarking LMs on human-scale datasets to understand how linguistic knowledge emerges from children's natural training data. Using transcripts from the BabyView dataset (videos from children ages 6-36 months), we investigate (1) scaling performance at child-scale data regimes, (2) variability in model performance across datasets from different children's experiences, along with linguistic predictors of dataset quality, and (3) relationships between model and child language learning outcomes. LMs trained on child data show acceptable scaling for grammar tasks, but lower scaling on semantic and world-knowledge tasks than models trained on synthetic data; we also observe substantial variability across data from different children. Beyond dataset size, performance is most associated with a combination of distributional and interactional ...
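As context for finding (1), "scaling performance" of this kind is commonly summarized by fitting a saturating power law of benchmark score against training-set size. The sketch below is purely illustrative and not from the paper: the functional form, the `saturating_power_law` helper, and every number in it are hypothetical assumptions.

```python
# Illustrative only: fitting a saturating power law to benchmark scores
# versus training-set size. All values are hypothetical; the paper does
# not specify this functional form or these numbers.
import numpy as np
from scipy.optimize import curve_fit

def saturating_power_law(n_words, ceiling, scale, alpha):
    """Score approaches `ceiling` as training words grow; `alpha` sets the rate."""
    return ceiling - scale * n_words ** (-alpha)

# Hypothetical (training words, benchmark accuracy) points at child-scale sizes.
n = np.array([1e5, 3e5, 1e6, 3e6, 1e7])
acc = np.array([0.52, 0.58, 0.64, 0.69, 0.72])

params, _ = curve_fit(saturating_power_law, n, acc, p0=[0.8, 5.0, 0.3], maxfev=10000)
ceiling, scale, alpha = params
print(f"fit: ceiling={ceiling:.2f}, alpha={alpha:.2f}")

# Extrapolate the fitted curve to a larger (still human-scale) word budget.
print(f"predicted score at 1e8 words: {saturating_power_law(1e8, *params):.2f}")
```

Comparing the fitted `alpha` across task families (e.g., grammar versus world knowledge) is one way to make a claim like "lower scaling on semantic tasks" concrete, though the paper's actual analysis may differ.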