Data Science

Data analysis, statistics, and data engineering

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min
Accelerating science with AI and simulations
Machine Learning

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min
LLMs

[P] I built an autonomous ML agent that runs experiments on tabular data indefinitely - inspired by Karpathy's AutoResearch

Inspired by Andrej Karpathy's AutoResearch, I built a system where Claude Code acts as an autonomous ML researcher on tabular binary clas...

Reddit - Machine Learning · 1 min

All Content

[2603.03275] Learning Demographic-Conditioned Mobility Trajectories with Aggregate Supervision
Machine Learning

Abstract page for arXiv paper 2603.03275: Learning Demographic-Conditioned Mobility Trajectories with Aggregate Supervision

arXiv - Machine Learning · 3 min
[2603.03230] SynthCharge: An Electric Vehicle Routing Instance Generator with Feasibility Screening to Enable Learning-Based Optimization and Benchmarking
Machine Learning

Abstract page for arXiv paper 2603.03230: SynthCharge: An Electric Vehicle Routing Instance Generator with Feasibility Screening to Enabl...

arXiv - AI · 3 min
[2603.03207] I-CAM-UV: Integrating Causal Graphs over Non-Identical Variable Sets Using Causal Additive Models with Unobserved Variables
Machine Learning

Abstract page for arXiv paper 2603.03207: I-CAM-UV: Integrating Causal Graphs over Non-Identical Variable Sets Using Causal Additive Mode...

arXiv - Machine Learning · 4 min
[2603.03206] Understanding and Mitigating Dataset Corruption in LLM Steering
LLMs

Abstract page for arXiv paper 2603.03206: Understanding and Mitigating Dataset Corruption in LLM Steering

arXiv - AI · 4 min
[2603.03172] Less Noise, Same Certificate: Retain Sensitivity for Unlearning
Machine Learning

Abstract page for arXiv paper 2603.03172: Less Noise, Same Certificate: Retain Sensitivity for Unlearning

arXiv - Machine Learning · 4 min
[2603.02411] From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Precision and Compactness
Machine Learning

Abstract page for arXiv paper 2603.02411: From Fewer Samples to Fewer Bits: Reframing Dataset Distillation as Joint Optimization of Preci...

arXiv - Machine Learning · 3 min
[2603.03056] Incremental Graph Construction Enables Robust Spectral Clustering of Texts
NLP

Abstract page for arXiv paper 2603.03056: Incremental Graph Construction Enables Robust Spectral Clustering of Texts

arXiv - Machine Learning · 3 min
[2603.02252] Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics
Machine Learning

Abstract page for arXiv paper 2603.02252: Whisper-RIR-Mega: A Paired Clean-Reverberant Speech Benchmark for ASR Robustness to Room Acoustics

arXiv - Machine Learning · 3 min
[2603.02935] Contextual Latent World Models for Offline Meta Reinforcement Learning
Machine Learning

Abstract page for arXiv paper 2603.02935: Contextual Latent World Models for Offline Meta Reinforcement Learning

arXiv - Machine Learning · 3 min
[2603.02840] Adapting Time Series Foundation Models through Data Mixtures
LLMs

Abstract page for arXiv paper 2603.02840: Adapting Time Series Foundation Models through Data Mixtures

arXiv - Machine Learning · 4 min
[2603.02756] Rethinking Time Series Domain Generalization via Structure-Stratified Calibration
AI Safety

Abstract page for arXiv paper 2603.02756: Rethinking Time Series Domain Generalization via Structure-Stratified Calibration

arXiv - Machine Learning · 3 min
[2603.02212] GLEAN: Grounded Lightweight Evaluation Anchors for Contamination-Aware Tabular Reasoning
Machine Learning

Abstract page for arXiv paper 2603.02212: GLEAN: Grounded Lightweight Evaluation Anchors for Contamination-Aware Tabular Reasoning

arXiv - AI · 3 min
[2603.03072] TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning
LLMs

Abstract page for arXiv paper 2603.03072: TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

arXiv - AI · 4 min
[2603.02702] FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing
NLP

Abstract page for arXiv paper 2603.02702: FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing

arXiv - Machine Learning · 4 min
[2603.02237] Concept Heterogeneity-aware Representation Steering
LLMs

Abstract page for arXiv paper 2603.02237: Concept Heterogeneity-aware Representation Steering

arXiv - AI · 4 min
[2603.02239] Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents
LLMs

Abstract page for arXiv paper 2603.02239: Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foun...

arXiv - AI · 4 min
[2603.02221] MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabular Prediction
LLMs

Abstract page for arXiv paper 2603.02221: MedFeat: Model-Aware and Explainability-Driven Feature Engineering with LLMs for Clinical Tabul...

arXiv - AI · 4 min
[2603.02215] RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchical Curriculum Learning
LLMs

Abstract page for arXiv paper 2603.02215: RxnNano: Training Compact LLMs for Chemical Reaction and Retrosynthesis Prediction via Hierarchi...

arXiv - AI · 4 min
LLMs

[D] Quantified analysis of 2,218 Gary Marcus claims - two independent LLM pipelines, scored against evidence

Built a dataset scoring every testable claim from Marcus's 474 Substack posts. Two pipelines (Claude Opus 4.6 and ChatGPT Codex) analyzed...

Reddit - Machine Learning · 1 min
Machine Learning

[P] I trained Qwen2.5-1.5b with RLVR (GRPO) vs SFT and compared benchmark performance

Hello everyone. I trained Qwen2.5-1.5b-Instruct with both RLVR and SFT on the GSM8K dataset and compared the results across GSM8K and MAT...

Reddit - Machine Learning · 1 min