[2603.22339] Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits
Computer Science > Machine Learning

arXiv:2603.22339 (cs) [Submitted on 21 Mar 2026]

Title: Problems with Chinchilla Approach 2: Systematic Biases in IsoFLOP Parabola Fits

Authors: Eric Czech, Zhiwei Xu, Yael Elmatad, Yixin Wang, William Held

Abstract: Chinchilla Approach 2 is among the most widely used methods for fitting neural scaling laws. Its parabolic approximation introduces systematic biases in compute-optimal allocation estimates, even on noise-free synthetic data. Applied to published Llama 3 IsoFLOP data at open frontier compute scales, these biases imply a parameter underallocation corresponding to 6.5% of the $3.8\times10^{25}$ FLOP training budget and \$1.4M (90% CI: \$412K-\$2.9M) in unnecessary compute at 50% H100 MFU. Simulated multimodal model misallocations show even greater opportunity costs due to higher loss surface asymmetry. Three sources of this error are examined: IsoFLOP sampling grid width (Taylor approximation accuracy), uncentered IsoFLOP sampling, and loss surface asymmetry ($\alpha \neq \beta$). Chinchilla Approach 3 largely eliminates these biases but is often regarded as less data-efficient, numerically unstable, prone to local minima, and harder to implement. Each concern is shown to be unfounded or addressable, especially when the partially linear structure of the objective…
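The bias mechanism the abstract describes can be reproduced in a few lines: on noise-free synthetic data from an asymmetric Chinchilla-style loss ($\alpha \neq \beta$), fitting a parabola in $\log N$ along an IsoFLOP profile and reading off its vertex does not recover the true compute-optimal parameter count. The sketch below is not the paper's code; the constants `E, A, B, alpha, beta`, the compute budget `C`, and the `C ~ 6ND` approximation are illustrative assumptions.

```python
# Sketch (illustrative constants, not the paper's): parabola-fit bias
# on a noise-free, asymmetric IsoFLOP loss profile.
import numpy as np

# Hypothetical Chinchilla-style scaling-law constants (assumptions)
E, A, B = 1.69, 406.4, 410.7
alpha, beta = 0.34, 0.28          # asymmetry: alpha != beta
C = 1e21                          # fixed compute budget in FLOPs

def loss(N):
    """Loss along the IsoFLOP curve, with tokens D implied by C ~ 6*N*D."""
    D = C / (6.0 * N)
    return E + A * N**(-alpha) + B * D**(-beta)

# "True" compute-optimal N from a very fine grid (loss is convex in log N)
N_fine = np.logspace(7, 11, 200001)
N_star = N_fine[np.argmin(loss(N_fine))]

# Approach 2-style estimate: fit a parabola in log10(N) over a wide,
# centered IsoFLOP sampling grid and take the vertex as the optimum.
N_grid = np.logspace(np.log10(N_star) - 1.0, np.log10(N_star) + 1.0, 9)
x = np.log10(N_grid)
a, b, c = np.polyfit(x, loss(N_grid), 2)
N_hat = 10 ** (-b / (2.0 * a))    # vertex of the fitted parabola

bias_pct = 100.0 * (N_hat / N_star - 1.0)
print(f"true N*  = {N_star:.3e}")
print(f"fitted N = {N_hat:.3e}  (bias {bias_pct:+.1f}%)")
```

Even with a perfectly centered grid and zero noise, the vertex lands a few percent away from the true optimum, because the exact profile is a sum of two exponentials in $\log N$, not a parabola; the misfit grows with grid width and with the asymmetry between $\alpha$ and $\beta$, matching the three error sources the abstract lists.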