Data Science

Data analysis, statistics, and data engineering

Top This Week

LLMs

[P] I built an autonomous ML agent that runs experiments on tabular data indefinitely - inspired by Karpathy's AutoResearch

Inspired by Andrej Karpathy's AutoResearch, I built a system where Claude Code acts as an autonomous ML researcher on tabular binary clas...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] Data curation and targeted replacement as a pre-training alignment and controllability method

Hi, r/MachineLearning: has much research been done in large-scale training scenarios where undesirable data has been replaced before trai...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] I tested Meta’s brain-response model on posts. It predicted the Elon one almost perfectly.

I built an experimental UI and visualization layer around Meta’s open brain-response model just to see whether this stuff actually works ...

Reddit - Machine Learning · 1 min ·

All Content

Data Science

[2603.05058] A 360-degree Multi-camera System for Blue Emergency Light Detection Using Color Attention RT-DETR and the ABLDataset

Abstract page for arXiv paper 2603.05058: A 360-degree Multi-camera System for Blue Emergency Light Detection Using Color Attention RT-DE...

arXiv - AI · 4 min ·
Machine Learning

[2603.04718] AI-Assisted Moot Courts: Simulating Justice-Specific Questioning in Oral Arguments

Abstract page for arXiv paper 2603.04718: AI-Assisted Moot Courts: Simulating Justice-Specific Questioning in Oral Arguments

arXiv - AI · 4 min ·
LLMs

[2603.04698] Hate Speech Detection using Large Language Models with Data Augmentation and Feature Enhancement

Abstract page for arXiv paper 2603.04698: Hate Speech Detection using Large Language Models with Data Augmentation and Feature Enhancement

arXiv - AI · 3 min ·
Data Science

[2603.04458] Learning Unified Distance Metric for Heterogeneous Attribute Data Clustering

Abstract page for arXiv paper 2603.04458: Learning Unified Distance Metric for Heterogeneous Attribute Data Clustering

arXiv - Machine Learning · 4 min ·
Machine Learning

[2603.04450] MPBMC: Multi-Property Bounded Model Checking with GNN-guided Clustering

Abstract page for arXiv paper 2603.04450: MPBMC: Multi-Property Bounded Model Checking with GNN-guided Clustering

arXiv - Machine Learning · 3 min ·
Data Science

[2603.04449] An Explainable Ensemble Framework for Alzheimer's Disease Prediction Using Structured Clinical and Cognitive Data

Abstract page for arXiv paper 2603.04449: An Explainable Ensemble Framework for Alzheimer's Disease Prediction Using Structured Clinical ...

arXiv - Machine Learning · 4 min ·
LLMs

[2603.05399] Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

Abstract page for arXiv paper 2603.05399: Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

arXiv - AI · 3 min ·
NLP

[2603.05295] WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

Abstract page for arXiv paper 2603.05295: WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

arXiv - AI · 3 min ·
LLMs

[2603.05120] Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Reasoning

Abstract page for arXiv paper 2603.05120: Bidirectional Curriculum Generation: A Multi-Agent Framework for Data-Efficient Mathematical Re...

arXiv - AI · 3 min ·
AI Agents

[2603.05031] AegisUI: Behavioral Anomaly Detection for Structured User Interface Protocols in AI Agent Systems

Abstract page for arXiv paper 2603.05031: AegisUI: Behavioral Anomaly Detection for Structured User Interface Protocols in AI Agent Systems

arXiv - AI · 4 min ·
Machine Learning

[2603.04981] Rethinking Representativeness and Diversity in Dynamic Data Selection

Abstract page for arXiv paper 2603.04981: Rethinking Representativeness and Diversity in Dynamic Data Selection

arXiv - AI · 4 min ·
LLMs

[2603.04822] VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

Abstract page for arXiv paper 2603.04822: VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

arXiv - AI · 4 min ·
LLMs

[2603.04791] Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

Abstract page for arXiv paper 2603.04791: Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

arXiv - AI · 4 min ·
LLMs

[2603.04670] Using Vision + Language Models to Predict Item Difficulty

Abstract page for arXiv paper 2603.04670: Using Vision + Language Models to Predict Item Difficulty

arXiv - AI · 3 min ·
LLMs

[2603.04631] Towards automated data analysis: A guided framework for LLM-based risk estimation

Abstract page for arXiv paper 2603.04631: Towards automated data analysis: A guided framework for LLM-based risk estimation

arXiv - AI · 3 min ·
Open Source AI

Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine‑Tuning, and On‑Device Optimizations

A Blog post by NXP on Hugging Face

Hugging Face Blog · 10 min ·
Machine Learning

[D] Impact of EU AI Act on your work?

Greetings r/MachineLearning. I am studying the impact of EU AI Act on data science practitioners, especially those working on models that...

Reddit - Machine Learning · 1 min ·
Machine Learning

[2601.04646] Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant Search

Abstract page for arXiv paper 2601.04646: Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant ...

arXiv - AI · 4 min ·
LLMs

[2510.15040] Composition-Grounded Data Synthesis for Visual Reasoning

Abstract page for arXiv paper 2510.15040: Composition-Grounded Data Synthesis for Visual Reasoning

arXiv - Machine Learning · 4 min ·
Machine Learning

[2509.25095] Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

Abstract page for arXiv paper 2509.25095: Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

arXiv - Machine Learning · 4 min ·