[2603.26611] Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression

arXiv - Machine Learning March 30, 2026 4 min read

arXiv:2603.26611 (cs), Computer Science > Machine Learning. Submitted on 27 Mar 2026.

Title: Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression

Authors: Rafael Izbicki, Pedro L. C. Rodrigues

Abstract: Conditional density estimation (CDE) - recovering the full conditional distribution of a response given tabular covariates - is essential in settings with heteroscedasticity, multimodality, or asymmetric uncertainty. Recent tabular foundation models, such as TabPFN and TabICL, naturally produce predictive distributions, but their effectiveness as general-purpose CDE methods has not been systematically evaluated, unlike their performance for point prediction, which is well studied. We benchmark three tabular foundation model variants against a diverse set of parametric, tree-based, and neural CDE baselines on 39 real-world datasets, across training sizes from 50 to 20,000, using six metrics covering density accuracy, calibration, and computation time. Across all sample sizes, foundation models achieve the best CDE loss, log-likelihood, and CRPS on the large majority of datasets tested. Calibration is competitive at small sample sizes but, for some metrics and datasets, lags behind task-specific neural baselines at larger sample sizes, suggesting that post...
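The density-accuracy and calibration metrics named in the abstract (CDE loss, log-likelihood, CRPS) can be illustrated with a minimal sketch. It assumes that a model's predictive density has already been evaluated on a fixed grid of response values. The function names `trapezoid` and `density_metrics` and the grid representation are hypothetical and are not taken from the paper's code. The CDE loss uses the common empirical estimate, which is defined only up to an additive constant, and the probability integral transform (PIT) is included as one standard calibration diagnostic. The paper's exact definitions and its six metrics may differ.

```python
import numpy as np


def trapezoid(values, grid):
    """Trapezoidal rule along the last axis of `values`."""
    widths = np.diff(grid)
    return np.sum(0.5 * (values[..., 1:] + values[..., :-1]) * widths, axis=-1)


def density_metrics(dens, grid, y):
    """Score gridded conditional densities against observed responses.

    dens : (n, m) array, dens[i, j] = estimated f(grid[j] | x_i)
    grid : (m,) increasing array of response values
    y    : (n,) observed responses
    """
    # Estimated density at each observed response (linear interpolation).
    f_at_y = np.array([np.interp(yi, grid, di) for yi, di in zip(y, dens)])

    # Empirical CDE loss, up to an additive constant:
    # mean of int f_hat(t|x_i)^2 dt  minus  2 * mean of f_hat(y_i|x_i).
    cde_loss = np.mean(trapezoid(dens**2, grid)) - 2.0 * np.mean(f_at_y)

    # Mean log-likelihood; the floor avoids log(0) on the grid edges.
    log_lik = np.mean(np.log(np.maximum(f_at_y, 1e-300)))

    # Predictive CDF on the grid via a cumulative trapezoidal rule.
    increments = 0.5 * (dens[:, 1:] + dens[:, :-1]) * np.diff(grid)
    cdf = np.concatenate(
        [np.zeros((dens.shape[0], 1)), np.cumsum(increments, axis=1)], axis=1
    )

    # CRPS: integral of (F(t) - 1{t >= y})^2 over the grid, averaged over i.
    step = (grid[None, :] >= y[:, None]).astype(float)
    crps = np.mean(trapezoid((cdf - step) ** 2, grid))

    # PIT values F(y_i | x_i); roughly uniform on [0, 1] when calibrated.
    pit = np.array([np.interp(yi, grid, ci) for yi, ci in zip(y, cdf)])

    return {"cde_loss": cde_loss, "log_lik": log_lik, "crps": crps, "pit": pit}


# Toy usage: the same standard normal predictive density for three rows.
grid = np.linspace(-10.0, 10.0, 4001)
normal_pdf = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)
dens = np.tile(normal_pdf, (3, 1))
y = np.array([0.0, 1.0, -0.5])
metrics = density_metrics(dens, grid, y)
```

Lower CDE loss and CRPS and higher log-likelihood indicate a better density estimate. The grid must be wide and fine enough that each row of `dens` integrates to approximately one, otherwise the CDF and CRPS values are biased.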

Originally published on March 30, 2026. Curated by AI News.
