Data Science

Data analysis, statistics, and data engineering

Top This Week

Data Science

Mantis Biotech is making 'digital twins' of humans to help solve medicine's data availability problem | TechCrunch

Mantis takes disparate sources of data to make synthetic datasets that can be used to build so-called "digital twins" of the human body, ...

TechCrunch - AI · 6 min ·
NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min ·
AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·

All Content

Machine Learning

[2602.08470] Learning Credal Ensembles via Distributionally Robust Optimization

This paper presents CreDRO, a novel approach to learning credal ensembles using distributionally robust optimization, enhancing model rob...

arXiv - Machine Learning · 4 min ·
AI Agents

[2505.19792] Types of Relations: Defining Analogies with Category Theory

This paper explores the representation of knowledge through analogies using category theory, highlighting how features of domains can fac...

arXiv - AI · 3 min ·
Machine Learning

[2602.00299] Agentic Framework for Epidemiological Modeling

The paper introduces EPIAGENT, an innovative agentic framework for epidemiological modeling that automates the synthesis, calibration, an...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2601.22123] Learning Hamiltonian Flow Maps: Mean Flow Consistency for Large-Timestep Molecular Dynamics

The paper introduces a novel framework for learning Hamiltonian Flow Maps that enables stable large-timestep updates in molecular dynamic...

arXiv - Machine Learning · 3 min ·
LLMs

[2504.01445] Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning

The paper introduces Compositional-ARC, a dataset for evaluating systematic generalization in abstract spatial reasoning, demonstrating t...

arXiv - AI · 4 min ·
Machine Learning

[2601.18231] Rethinking Cross-Modal Fine-Tuning: Optimizing the Interaction between Feature Alignment and Target Fitting

This paper presents a framework for optimizing cross-modal fine-tuning by addressing the interaction between feature alignment and target...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.11670] A Confidence-Variance Theory for Pseudo-Label Selection in Semi-Supervised Learning

This paper presents a novel Confidence-Variance (CoVar) theory for pseudo-label selection in semi-supervised learning, addressing the lim...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.23335] Understanding Usage and Engagement in AI-Powered Scientific Research Tools: The Asta Interaction Dataset

This paper presents the Asta Interaction Dataset, analyzing over 200,000 user queries from AI-powered research tools to understand user e...

arXiv - AI · 4 min ·
Machine Learning

[2510.27480] Simplex-to-Euclidean Bijections for Categorical Flow Matching

The paper presents a novel method for learning and sampling from probability distributions on the simplex, utilizing smooth bijections to...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2509.22935] Compute-Optimal Quantization-Aware Training

This paper explores Compute-Optimal Quantization-Aware Training (QAT), revealing how optimal compute allocation between full-precision an...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2509.21725] Information-Theoretic Bayesian Optimization for Bilevel Optimization Problems

This paper presents an information-theoretic approach to Bayesian optimization for bilevel optimization problems, addressing the complexi...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.23259] Risk-Aware World Model Predictive Control for Generalizable End-to-End Autonomous Driving

This paper presents the Risk-aware World Model Predictive Control (RaWMPC) framework aimed at enhancing generalization in end-to-end auto...

arXiv - AI · 4 min ·
LLMs

[2509.21013] Predicting LLM Reasoning Performance with Small Proxy Model

This article presents rBridge, a small proxy model that predicts reasoning performance in large language models (LLMs), demonstrating sig...

arXiv - Machine Learning · 4 min ·
AI Safety

[2509.15429] Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data

This paper presents a Random Matrix Theory-guided approach to sparse PCA for single-cell RNA-seq data, enhancing dimensionality reduction...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2509.03810] Online time series prediction using feature adjustment

The paper presents a novel approach to online time series prediction, addressing challenges related to distribution shifts and delayed fe...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2508.01101] Fast and Flexible Probabilistic Forecasting of Dynamical Systems using Flow Matching and Physical Perturbation

This article presents a novel framework for probabilistic forecasting of dynamical systems, utilizing flow matching and physical perturba...

arXiv - Machine Learning · 4 min ·
LLMs

[2507.03772] Skewed Score: A statistical framework to assess autograders

The paper presents a statistical framework for assessing autograders used in evaluating LLM outputs, addressing reliability and bias issu...

arXiv - Machine Learning · 4 min ·
Computer Vision

[2506.15190] Learning Task-Agnostic Motifs to Capture the Continuous Nature of Animal Behavior

The paper presents a novel framework, Motif-based Continuous Dynamics (MCD), to model animal behavior by identifying continuous motor mot...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2505.24403] On the Lipschitz Continuity of Set Aggregation Functions and Neural Networks for Sets

This paper explores the Lipschitz continuity of set aggregation functions and neural networks designed for set data, providing insights i...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2505.16952] FrontierCO: Real-World and Large-Scale Evaluation of Machine Learning Solvers for Combinatorial Optimization

The paper presents FrontierCO, a benchmark for evaluating machine learning solvers in combinatorial optimization, emphasizing real-world ...

arXiv - Machine Learning · 4 min ·