[2511.14827] Implicit Bias of the JKO Scheme

arXiv - AI March 05, 2026 4 min read

About this article

Abstract page for arXiv paper 2511.14827: Implicit Bias of the JKO Scheme

Statistics > Machine Learning arXiv:2511.14827 (stat) [Submitted on 18 Nov 2025 (v1), last revised 3 Mar 2026 (this version, v3)] Title:Implicit Bias of the JKO Scheme Authors:Peter Halmos, Boris Hanin View a PDF of the paper titled Implicit Bias of the JKO Scheme, by Peter Halmos and Boris Hanin View PDF HTML (experimental) Abstract:Wasserstein gradient flow provides a general framework for minimizing an energy functional $J$ over the space of probability measures on a Riemannian manifold $(M,g)$. Its canonical time-discretization, the Jordan-Kinderlehrer-Otto (JKO) scheme, produces for any step size $\eta>0$ a sequence of probability distributions $\rho_k^\eta$ that approximate to first order in $\eta$ Wasserstein gradient flow on $J$. But the JKO scheme also has many other remarkable properties not shared by other first order integrators, e.g. it preserves energy dissipation and exhibits unconditional stability for $\lambda$-geodesically convex functionals $J$. To better understand the JKO scheme we characterize its implicit bias at second order in $\eta$. We show that $\rho_k^\eta$ are approximated to order $\eta^2$ by Wasserstein gradient flow on a modified energy \[ J^{\eta}(\rho) = J(\rho) - \frac{\eta}{4}\int_M \Big\lVert \nabla_g \frac{\delta J}{\delta \rho} (\rho) \Big\rVert_{2}^{2} \,\rho(dx), \] obtained by subtracting from $J$ the squared metric curvature of $J$ times $\eta/4$. The JKO scheme therefore adds at second order in $\eta$ a deceleration in direction...

Originally published on March 05, 2026. Curated by AI News.

Machine Learning

[R] I trained a 3k parameter model on XOR sequences of length 20. It extrapolates perfectly to length 1,000,000. Here's why I think that's architecturally significant.

I've been working on an alternative to attention-based sequence modeling that I'm calling Geometric Flow Networks (GFN). The core idea: i...

Reddit - Machine Learning · 1 min · about 3 hours ago

Machine Learning

[D] Data curation and targeted replacement as a pre-training alignment and controllability method

Hi, r/MachineLearning: has much research been done in large-scale training scenarios where undesirable data has been replaced before trai...

Reddit - Machine Learning · 1 min · about 7 hours ago

Ai Safety

I’ve come up with a new thought experiment to approach ASI, and it challenges the very notions of alignment and containment

I’ve written an essay exploring what I’m calling the Super-Intelligent Octopus Problem—a thought experiment designed to surface a paradox...

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

Ai Safety

Bias in AI: Examples and 6 Ways to Fix it in 2026

AI bias is an anomaly in the output of ML algorithms due to prejudiced assumptions. Explore types of AI bias, examples, how to reduce bia...

AI Events · 36 min · about 17 hours ago

[2511.14827] Implicit Bias of the JKO Scheme

About this article

Related Articles

[R] I trained a 3k parameter model on XOR sequences of length 20. It extrapolates perfectly to length 1,000,000. Here's why I think that's architecturally significant.

[D] Data curation and targeted replacement as a pre-training alignment and controllability method

I’ve come up with a new thought experiment to approach ASI, and it challenges the very notions of alignment and containment

Bias in AI: Examples and 6 Ways to Fix it in 2026

No comments

Stay updated with AI News