[2602.11320] Efficient Analysis of the Distilled Neural Tangent Kernel
Summary
The paper presents a novel approach that reduces the computational cost of Neural Tangent Kernel (NTK) methods through dataset distillation, cutting kernel computation complexity by up to five orders of magnitude.
Why It Matters
NTK methods are a central tool for analyzing neural networks, but the need to evaluate large Jacobians across many data points limits their use in practice. By introducing the distilled neural tangent kernel (DNTK), the authors cut this cost while preserving kernel structure and predictive performance, making NTK-based analysis practical at larger scales.
Key Takeaways
- NTK-tuned dataset distillation yields a 20-100x reduction in required Jacobian calculations.
- Dataset distillation can effectively compress data dimensions for NTK computation.
- Per-class NTK matrices retain low effective rank, aiding in computational efficiency.
- The proposed method combines dataset distillation with advanced projection techniques.
- Combining distillation with projection reduces NTK computational complexity by up to five orders of magnitude, improving the scalability of neural-network analyses.
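To make the cost model behind these takeaways concrete, here is a minimal sketch (not the paper's implementation): the empirical NTK Gram matrix has entries K[i, j] = ⟨∇θ f(x_i), ∇θ f(x_j)⟩, so it requires one Jacobian evaluation per data point. Shrinking the dataset from N points to n distilled points therefore cuts Jacobian work by a factor of N/n. The tiny network and the use of a plain subset as a stand-in for a distilled set are illustrative assumptions; real NTK-tuned distillation optimizes synthetic points rather than selecting real ones.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(theta, x):
    """Tiny two-layer scalar network; theta packs W1 (4x3) and w2 (4,)."""
    W1 = theta[:12].reshape(4, 3)
    w2 = theta[12:]
    return w2 @ np.tanh(W1 @ x)

def grad_f(theta, x, eps=1e-6):
    """Central-difference gradient of f wrt theta: one 'Jacobian row' per point."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (f(theta + d, x) - f(theta - d, x)) / (2 * eps)
    return g

def empirical_ntk(theta, X):
    """K[i, j] = <grad f(x_i), grad f(x_j)>; costs one Jacobian per data point."""
    J = np.stack([grad_f(theta, x) for x in X])  # (num_points, num_params)
    return J @ J.T                               # (num_points, num_points)

theta = rng.normal(size=16)
X_full = rng.normal(size=(200, 3))   # original dataset: would need 200 Jacobians
X_distilled = X_full[:10]            # illustrative stand-in for a distilled set
K = empirical_ntk(theta, X_distilled)
print(K.shape)                       # (10, 10): 20x fewer Jacobian evaluations
```

The quadratic-size Gram matrix and the per-point Jacobian loop are exactly why compressing the data dimension pays off: both the Jacobian count and the kernel size shrink with the distilled set.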
Computer Science > Machine Learning
arXiv:2602.11320 (cs)
[Submitted on 11 Feb 2026 (v1), last revised 17 Feb 2026 (this version, v2)]
Title: Efficient Analysis of the Distilled Neural Tangent Kernel
Authors: Jamie Mahowald, Brian Bell, Alex Ho, Michael Geyer
Abstract: Neural tangent kernel (NTK) methods are computationally limited by the need to evaluate large Jacobians across many data points. Existing approaches reduce this cost primarily through projecting and sketching the Jacobian. We show that NTK computation can also be reduced by compressing the data dimension itself using NTK-tuned dataset distillation. We demonstrate that the neural tangent space spanned by the input data can be induced by dataset distillation, yielding a 20-100$\times$ reduction in required Jacobian calculations. We further show that per-class NTK matrices have low effective rank that is preserved by this reduction. Building on these insights, we propose the distilled neural tangent kernel (DNTK), which combines NTK-tuned dataset distillation with state-of-the-art projection methods to reduce NTK computational complexity by up to five orders of magnitude while preserving kernel structure and predictive performance.
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2602.11320 [cs.LG]
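The abstract's low-effective-rank claim can be illustrated with one common spectral-entropy definition of effective rank, erank(K) = exp(-Σ p_i log p_i) with p_i = λ_i / Σ_j λ_j; this particular measure is an assumption for illustration and may differ from the one used in the paper. A Gram matrix built from features confined to a low-dimensional subspace has an effective rank far below its nominal size, which is what makes the compressed kernel a faithful stand-in.

```python
import numpy as np

def effective_rank(K, tol=1e-12):
    """Spectral-entropy effective rank: exp of the entropy of the
    normalized eigenvalue distribution of a symmetric PSD matrix K."""
    lam = np.clip(np.linalg.eigvalsh(K), 0.0, None)
    p = lam / lam.sum()
    p = p[p > tol]                    # drop numerically-zero modes
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(1)
# A 100x100 Gram matrix whose features live in a 5-dimensional subspace:
F = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 50))
K = F @ F.T                           # nominal size 100, rank <= 5
print(effective_rank(K))              # at most 5, far below the 100x100 size
```

For comparison, `effective_rank(np.eye(10))` returns 10: a flat spectrum uses every dimension, while a concentrated spectrum like the NTK's behaves as if only a few directions matter.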