[2602.12534] Linear Regression with Unknown Truncation Beyond Gaussian Features

Summary

This paper presents an algorithm for truncated linear regression when the survival set is unknown. It runs in $\mathrm{poly}(d/\varepsilon)$ time, improving on the $d^{\mathrm{poly}(1/\varepsilon)}$ run time of earlier methods, and it does not require Gaussian feature vectors.

Why It Matters

In truncated regression, a sample is observed only when its outcome falls inside a survival set, and in practice that set is rarely known in advance. According to the abstract, the only earlier algorithms for the unknown-set case needed strong assumptions on the feature distribution (e.g., Gaussianity) and still ran in $d^{\mathrm{poly}(1/\varepsilon)}$ time. Relaxing both limitations makes the approach applicable to a wider range of problems in statistics and machine learning.

Key Takeaways

  • Gives the first algorithm for truncated linear regression with an unknown survival set that runs in $\mathrm{poly}(d/\varepsilon)$ time.
  • Here $d$ is the number of features and $\varepsilon$ is the target accuracy; earlier algorithms for this setting needed $d^{\mathrm{poly}(1/\varepsilon)}$ time.
  • Requires only sub-Gaussian feature vectors rather than Gaussian ones, broadening its applicability.
  • Contributes to positive-only PAC learning, which may have further implications in machine learning.
  • Handles the case where the survival set must be learned from data, whereas most prior work assumes it is precisely known.
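
To see why a run time polynomial in $d/\varepsilon$ matters compared with one of the form $d^{\mathrm{poly}(1/\varepsilon)}$, here is a rough worked comparison. The exponents are hypothetical stand-ins chosen only for illustration; the summary does not state the actual polynomials.

```latex
\[
\text{With } d = 100,\ \varepsilon = 0.1:\qquad
d^{1/\varepsilon^{2}} = 100^{100} = 10^{200},
\qquad
(d/\varepsilon)^{3} = 1000^{3} = 10^{9}.
\]
```

Under these assumed exponents, the first quantity is far beyond any feasible computation while the second is routine.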

Statistics > Machine Learning
arXiv:2602.12534 (stat)
[Submitted on 13 Feb 2026]

Title: Linear Regression with Unknown Truncation Beyond Gaussian Features

Authors: Alexandros Kouridakis, Anay Mehrotra, Alkis Kalavasis, Constantine Caramanis

Abstract: In truncated linear regression, samples $(x,y)$ are shown only when the outcome $y$ falls inside a certain survival set $S^\star$ and the goal is to estimate the unknown $d$-dimensional regressor $w^\star$. This problem has a long history of study in Statistics and Machine Learning going back to the works of (Galton, 1897; Tobin, 1958) and more recently in, e.g., (Daskalakis et al., 2019; 2021; Lee et al., 2023; 2024). Despite this long history, however, most prior works are limited to the special case where $S^\star$ is precisely known. The more practically relevant case, where $S^\star$ is unknown and must be learned from data, remains open: indeed, here the only available algorithms require strong assumptions on the distribution of the feature vectors (e.g., Gaussianity) and, even then, have a $d^{\mathrm{poly} (1/\varepsilon)}$ run time for achieving $\varepsilon$ accuracy. In this work, we give the first algorithm for truncated linear regression with unknown survival set that runs in $\mathrm{poly} (d/\varepsilon)$ time, by only requiring that the feature...
