[2505.11846] Learning on a Razor's Edge: Identifiability and Singularity of Polynomial Neural Networks

arXiv - Machine Learning

Summary

This paper investigates the identifiability and singularity of polynomial neural networks, focusing on MLPs and CNNs, and explores their geometric properties using algebraic geometry.

Why It Matters

Understanding the identifiability of neural networks is crucial for improving model interpretability and performance. This research sheds light on the structure of neural networks, potentially guiding future developments in machine learning and AI.

Key Takeaways

  • For MLPs, almost every function in the neuromanifold is realized by only finitely many parameter choices.
  • For CNNs, the parameterization is generically one-to-one, which also yields the dimension of the neuromanifold.
  • Singular points of neuromanifolds arise from sparse subnetworks.
  • For MLPs, these singularities often correspond to critical points of the mean-squared-error loss, giving a geometric explanation of MLPs' sparsity bias; this correspondence does not hold for CNNs.
  • The analysis relies on tools from algebraic geometry.
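
The non-identifiability in the first takeaway can be illustrated with a minimal sketch (not a construction from the paper; network sizes and the choice of activation are hypothetical). For a two-layer network with the homogeneous polynomial activation sigma(x) = x^2, rescaling the first layer by c and the second by 1/c^2 leaves the network function unchanged, so distinct parameter choices yield the same point of the neuromanifold:

```python
import numpy as np

# Illustrative sketch: a two-layer "polynomial MLP" with square activation.
# Because sigma(x) = x**2 is homogeneous of degree 2, scaling the first
# layer by c and the second by 1/c**2 leaves f unchanged, so the
# parametrization is many-to-one rather than injective.

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # first-layer weights (hypothetical sizes)
W2 = rng.standard_normal((2, 4))   # second-layer weights

def net(W1, W2, x):
    """f(x) = W2 @ sigma(W1 @ x) with sigma applied entrywise."""
    return W2 @ (W1 @ x) ** 2

c = 3.0
W1_alt, W2_alt = c * W1, W2 / c**2  # a genuinely different parameter choice

x = rng.standard_normal(3)
assert np.allclose(net(W1, W2, x), net(W1_alt, W2_alt, x))
print("two distinct parameter sets, same network function")
```

The paper's identifiability result says that, for a generic MLP function, only finitely many such alternative parameter choices exist; the continuous family here is an artifact of the homogeneous activation picked for this toy example.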

Computer Science > Machine Learning
arXiv:2505.11846 (cs)
[Submitted on 17 May 2025 (v1), last revised 13 Feb 2026 (this version, v2)]

Title: Learning on a Razor's Edge: Identifiability and Singularity of Polynomial Neural Networks

Authors: Vahid Shahverdi, Giovanni Luca Marchetti, Kathlén Kohn

Abstract: We study function spaces parametrized by neural networks, referred to as neuromanifolds. Specifically, we focus on deep Multi-Layer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs) with an activation function that is a sufficiently generic polynomial. First, we address the identifiability problem, showing that, for almost all functions in the neuromanifold of an MLP, there exist only finitely many parameter choices yielding that function. For CNNs, the parametrization is generically one-to-one. As a consequence, we compute the dimension of the neuromanifold. Second, we describe singular points of neuromanifolds. We characterize singularities completely for CNNs, and partially for MLPs. In both cases, they arise from sparse subnetworks. For MLPs, we prove that these singularities often correspond to critical points of the mean-squared error loss, which does not hold for CNNs. This provides a geometric explanation of the sparsity bias of MLPs. All of our results leverage tools from algebraic geometry.

Related Articles

Machine Learning

Finally Abliterated Sarvam 30B and 105B!

I abliterated Sarvam-30B and 105B - India's first multilingual MoE reasoning models - and found something interesting along the way! Reas...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

BANKING77-77: New best of 94.61% on the official test set (+0.13pp over our previous 94.48%).

Hi everyone, Just wanted to share a small but hard-won milestone. After a long plateau at 94.48%, we’ve pushed the official BANKING77-77 ...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Free tool I built to score dataset quality (LQS) — feedback welcome [D]

We built a Label Quality Score (LQS) system for our dataset marketplace and opened it up as a free standalone tool. Upload a dataset → ge...

Reddit - Machine Learning · 1 min ·
Machine Learning

Meta’s New AI Model Gives Mark Zuckerberg a Seat at the Big Kid’s Table | WIRED

Muse Spark is Meta’s first model since its AI reboot, and the benchmarks suggest formidable performance.

Wired - AI · 6 min ·