Data Science

Data analysis, statistics, and data engineering


Top This Week

Data Science

Mantis Biotech is making 'digital twins' of humans to help solve medicine's data availability problem | TechCrunch

Mantis takes disparate sources of data to make synthetic datasets that can be used to build so-called "digital twins" of the human body, ...

TechCrunch - AI · 6 min · about 7 hours ago

NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min · about 8 hours ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 15 hours ago

All Content

Machine Learning

[2602.08470] Learning Credal Ensembles via Distributionally Robust Optimization

This paper presents CreDRO, a novel approach to learning credal ensembles using distributionally robust optimization, enhancing model rob...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Agents

[2505.19792] Types of Relations: Defining Analogies with Category Theory

This paper explores the representation of knowledge through analogies using category theory, highlighting how features of domains can fac...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.00299] Agentic Framework for Epidemiological Modeling

The paper introduces EPIAGENT, an innovative agentic framework for epidemiological modeling that automates the synthesis, calibration, an...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2601.22123] Learning Hamiltonian Flow Maps: Mean Flow Consistency for Large-Timestep Molecular Dynamics

The paper introduces a novel framework for learning Hamiltonian Flow Maps that enables stable large-timestep updates in molecular dynamic...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2504.01445] Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning

The paper introduces Compositional-ARC, a dataset for evaluating systematic generalization in abstract spatial reasoning, demonstrating t...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2601.18231] Rethinking Cross-Modal Fine-Tuning: Optimizing the Interaction between Feature Alignment and Target Fitting

This paper presents a framework for optimizing cross-modal fine-tuning by addressing the interaction between feature alignment and target...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2601.11670] A Confidence-Variance Theory for Pseudo-Label Selection in Semi-Supervised Learning

This paper presents a novel Confidence-Variance (CoVar) theory for pseudo-label selection in semi-supervised learning, addressing the lim...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.23335] Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

This paper presents the Asta Interaction Dataset, analyzing over 200,000 user queries from AI-powered research tools to understand user e...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2510.27480] Simplex-to-Euclidean Bijections for Categorical Flow Matching

The paper presents a novel method for learning and sampling from probability distributions on the simplex, utilizing smooth bijections to...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2509.22935] Compute-Optimal Quantization-Aware Training

This paper explores Compute-Optimal Quantization-Aware Training (QAT), revealing how optimal compute allocation between full-precision an...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2509.21725] Information-Theoretic Bayesian Optimization for Bilevel Optimization Problems

This paper presents an information-theoretic approach to Bayesian optimization for bilevel optimization problems, addressing the complexi...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.23259] Risk-Aware World Model Predictive Control for Generalizable End-to-End Autonomous Driving

This paper presents the Risk-aware World Model Predictive Control (RaWMPC) framework aimed at enhancing generalization in end-to-end auto...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2509.21013] Predicting LLM Reasoning Performance with Small Proxy Model

This article presents rBridge, a small proxy model that predicts reasoning performance in large language models (LLMs), demonstrating sig...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Safety

[2509.15429] Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data

This paper presents a Random Matrix Theory-guided approach to sparse PCA for single-cell RNA-seq data, enhancing dimensionality reduction...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2509.03810] Online time series prediction using feature adjustment

The paper presents a novel approach to online time series prediction, addressing challenges related to distribution shifts and delayed fe...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2508.01101] Fast and Flexible Probabilistic Forecasting of Dynamical Systems using Flow Matching and Physical Perturbation

This article presents a novel framework for probabilistic forecasting of dynamical systems, utilizing flow matching and physical perturba...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2507.03772] Skewed Score: A statistical framework to assess autograders

The paper presents a statistical framework for assessing autograders used in evaluating LLM outputs, addressing reliability and bias issu...

arXiv - Machine Learning · 4 min · about 1 month ago

Computer Vision

[2506.15190] Learning Task-Agnostic Motifs to Capture the Continuous Nature of Animal Behavior

The paper presents a novel framework, Motif-based Continuous Dynamics (MCD), to model animal behavior by identifying continuous motor mot...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2505.24403] On the Lipschitz Continuity of Set Aggregation Functions and Neural Networks for Sets

This paper explores the Lipschitz continuity of set aggregation functions and neural networks designed for set data, providing insights i...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2505.16952] FrontierCO: Real-World and Large-Scale Evaluation of Machine Learning Solvers for Combinatorial Optimization

The paper presents FrontierCO, a benchmark for evaluating machine learning solvers in combinatorial optimization, emphasizing real-world ...

arXiv - Machine Learning · 4 min · about 1 month ago

