[2509.03417] Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study

[2509.03417] Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study

arXiv - Machine Learning 3 min read

About this article

Abstract page for arXiv paper 2509.03417: Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study

Computer Science > Machine Learning arXiv:2509.03417 (cs) [Submitted on 3 Sep 2025 (v1), last revised 30 Mar 2026 (this version, v3)] Title:Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study Authors:Spyros Rigas, Dhruv Verma, Georgios Alexandridis, Yixuan Wang View a PDF of the paper titled Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study, by Spyros Rigas and 3 other authors View PDF HTML (experimental) Abstract:Kolmogorov-Arnold Networks (KANs) are a recently introduced neural architecture that replace fixed nonlinearities with trainable activation functions, offering enhanced flexibility and interpretability. While KANs have been applied successfully across scientific and machine learning tasks, their initialization strategies remain largely unexplored. In this work, we study initialization schemes for spline-based KANs, proposing two theory-driven approaches inspired by LeCun and Glorot, as well as an empirical power-law family with tunable exponents. Our evaluation combines large-scale grid searches on function fitting and forward PDE benchmarks, an analysis of training dynamics through the lens of the Neural Tangent Kernel, and evaluations on a subset of the Feynman dataset. Our findings indicate that the Glorot-inspired initialization significantly outperforms the baseline in parameter-rich models, while power-law initialization achieves the strongest performance overall, both across tasks and for architectures of varyi...

Originally published on March 31, 2026. Curated by AI News.

Related Articles

Machine Learning

Is it actually possible to build a model-agnostic persistent text layer that keeps AI behavior stable?

Is it actually possible to define a persistent, model-agnostic text-based layer (loaded with the model each time) that keeps an AI system...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Are gamers being used as free labeling labor? The rise of "Simulators" that look like AI training grounds [D]

Hey everyone, I’m an AI news curator and editor currently working on a piece about a weird trend I’ve been spotting: technical simulators...

Reddit - Machine Learning · 1 min ·
Machine Learning

Coherence Without Convergence: A New Protocol for Multi-Agent AI

Opening For the past year, most progress in multi-agent AI has followed a familiar pattern: Add more agents. Add more coordination. Watch...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Week 6 AIPass update - answering the top questions from last post (file conflicts, remote models, scale)

Followup to last post with answers to the top questions from the comments. Appreciate everyone who jumped in. The most common one by a mi...

Reddit - Artificial Intelligence · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime