[2511.20888] Deep Learning as a Convex Paradigm of Computation: Minimizing Circuit Size with ResNets
Statistics > Machine Learning

arXiv:2511.20888 (stat)

[Submitted on 25 Nov 2025 (v1), last revised 25 Mar 2026 (this version, v2)]

Title: Deep Learning as a Convex Paradigm of Computation: Minimizing Circuit Size with ResNets
Authors: Arthur Jacot

Abstract: This paper argues that DNNs implement a computational Occam's razor -- finding the `simplest' algorithm that fits the data -- and that this could explain their incredible and wide-ranging success over more traditional statistical methods. We start with the discovery that the set of real-valued functions $f$ that can be $\epsilon$-approximated with a binary circuit of size at most $c\epsilon^{-\gamma}$ becomes convex in the `Harder than Monte Carlo' (HTMC) regime, when $\gamma>2$, allowing for the definition of an HTMC norm on functions. In parallel, one can define a complexity measure on the parameters of a ResNet (a weighted $\ell_1$ norm of the parameters), which induces a `ResNet norm' on functions. The HTMC and ResNet norms can then be related by an almost matching sandwich bound. Minimizing this ResNet norm is thus equivalent to finding a circuit that fits the data with an almost minimal number of nodes (within a power of 2 of being optimal). ResNets thus appear as an alternative model for computation of real functions, better adapted to the HTMC regime and i...
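The abstract leaves the construction of the HTMC norm implicit; as a hedged sketch only, one standard way a convex set of functions induces a norm is via its Minkowski gauge. The set $\mathcal{B}_{\gamma}(c)$ below, the choice of approximation metric $\lVert\cdot\rVert$, and the normalization are assumptions for illustration, not the paper's definitions.

```latex
% Sketch under stated assumptions: the abstract does not specify the
% approximation metric or the normalization; size(C) counts circuit nodes.
\[
  \mathcal{B}_{\gamma}(c) \;=\; \bigl\{\, f \;:\; \forall \epsilon > 0,\
  \exists\ \text{binary circuit } C_\epsilon \ \text{with}\
  \mathrm{size}(C_\epsilon) \le c\,\epsilon^{-\gamma}
  \ \text{and}\ \lVert f - C_\epsilon \rVert \le \epsilon \,\bigr\}.
\]
% For gamma > 2 this set is convex (the HTMC regime), so one candidate
% norm is its Minkowski gauge:
\[
  \lVert f \rVert_{\mathrm{HTMC}} \;=\;
  \inf\,\bigl\{\, \lambda > 0 \;:\; f/\lambda \in \mathcal{B}_{\gamma}(1) \,\bigr\}.
\]
```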
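Likewise, the abstract says only that the ResNet complexity measure is a weighted $\ell_1$ norm of the parameters. The minimal sketch below assumes per-layer weights $w_\ell$ and a standard two-matrix residual block; both the block parametrization and the uniform weights are hypothetical choices, not the paper's.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A minimal residual block: x + V @ relu(W @ x) (assumed form)."""
    def __init__(self, width: int):
        super().__init__()
        self.W = nn.Linear(width, width, bias=False)
        self.V = nn.Linear(width, width, bias=False)

    def forward(self, x):
        return x + self.V(torch.relu(self.W(x)))

def weighted_l1_norm(blocks, layer_weights):
    """Weighted l1 complexity: sum over layers of w_l * ||theta_l||_1."""
    total = 0.0
    for block, w in zip(blocks, layer_weights):
        total += w * sum(p.abs().sum() for p in block.parameters())
    return total

# Usage: an L-block ResNet with uniform layer weights (illustrative only).
L, width = 8, 64
blocks = nn.ModuleList(ResidualBlock(width) for _ in range(L))
complexity = weighted_l1_norm(blocks, layer_weights=[1.0] * L)
print(float(complexity))
```

In this reading, adding `complexity` as a regularizer to a training loss is what "minimizing the ResNet norm" would amount to in practice, with the sandwich bound tying its minimum to a near-minimal circuit size.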