[2603.03401] Beyond Cross-Validation: Adaptive Parameter Selection for Kernel-Based Gradient Descents


arXiv - Machine Learning 3 min read

About this article


Statistics > Machine Learning
arXiv:2603.03401 (stat) [Submitted on 3 Mar 2026]

Title: Beyond Cross-Validation: Adaptive Parameter Selection for Kernel-Based Gradient Descents
Authors: Xiaotong Liu, Yunwen Lei, Xiangyu Chang, Shao-Bo Lin

Abstract: This paper proposes a novel parameter selection strategy for kernel-based gradient descent (KGD) algorithms, integrating bias-variance analysis with the splitting method. We introduce the concept of empirical effective dimension to quantify iteration increments in KGD, deriving an adaptive parameter selection strategy that is implementable. Theoretical verifications are provided within the framework of learning theory. Utilizing the recently developed integral operator approach, we rigorously demonstrate that KGD, equipped with the proposed adaptive parameter selection strategy, achieves the optimal generalization error bound and adapts effectively to different kernels, target functions, and error metrics. Consequently, this strategy showcases significant advantages over existing parameter selection methods for KGD.

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Methodology (stat.ME)
Cite as: arXiv:2603.03401 [stat.ML] (or arXiv:2603.03401v1 [stat.ML] for this version)
DOI: https://doi.org/10.48550/arXiv.2603.03401
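The abstract describes stopping kernel gradient descent adaptively, using an empirical effective dimension to balance bias against variance instead of cross-validating the iteration number. As a rough illustration only, and not the paper's actual rule, the sketch below runs plain kernel gradient descent and stops once a standard effective-dimension quantity, with lambda roughly 1/(eta * t) standing in for the iteration count, overtakes the mean squared training residual. The Gaussian kernel, the lambda proxy, the stopping threshold, and all function names are assumptions made for demonstration.

```python
# Illustrative sketch of kernel gradient descent (KGD) with a data-driven
# stopping rule. The effective dimension N(lam) = trace(K (K + n*lam*I)^{-1})
# is a standard quantity; using it this way is a heuristic, not the paper's
# selection strategy.
import numpy as np

def gaussian_kernel(X, Z, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between rows of X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def effective_dimension(K, lam):
    """Empirical effective dimension trace(K (K + n*lam*I)^{-1})."""
    n = K.shape[0]
    return np.trace(K @ np.linalg.inv(K + n * lam * np.eye(n)))

def kgd_adaptive(X, y, bandwidth=1.0, eta=1.0, max_iter=500):
    """Kernel gradient descent with a heuristic effective-dimension stop."""
    n = len(y)
    K = gaussian_kernel(X, X, bandwidth)
    alpha = np.zeros(n)
    for t in range(1, max_iter + 1):
        residual = K @ alpha - y            # gradient of the empirical risk (up to scaling)
        alpha -= (eta / n) * residual
        lam = 1.0 / (eta * t)               # iteration count as implicit regularization
        # Heuristic bias-variance balance: stop once the "variance" proxy
        # N(lam)/n exceeds the "bias" proxy given by the squared residual.
        if effective_dimension(K, lam) / n >= np.mean(residual ** 2):
            break
    return alpha, t

# Usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(200)
alpha, stopped_at = kgd_adaptive(X, y)
print("stopped after", stopped_at, "iterations")
```

The point of the sketch is only that the iteration number plays the role of a regularization parameter, so a rule that tracks how the effective dimension grows with iterations can replace a cross-validated grid over stopping times.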

Originally published on March 05, 2026. Curated by AI News.
