Data Science

Data analysis, statistics, and data engineering


Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · 7 minutes ago

Machine Learning

[D] ICML 2026 Average Score

Hi all, I’m curious about the current review dynamics for ICML 2026, especially after the rebuttal phase. For those who are reviewers (or...

Reddit - Machine Learning · 1 min · about 14 hours ago

Machine Learning

Accelerating science with AI and simulations

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min · about 23 hours ago

All Content

Machine Learning

[2508.09639] UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles

The paper presents UbiQTree, a method for decomposing uncertainty in SHAP values used in explainable AI, focusing on aleatoric and episte...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2508.03616] Hidden Dynamics of Massive Activations in Transformer Training

This paper analyzes the emergence of massive activations during transformer training, revealing predictable patterns and offering a frame...

arXiv - AI · 3 min · about 1 month ago

Data Science

[2506.13792] ICE-ID: A Novel Historical Census Dataset for Longitudinal Identity Resolution

ICE-ID is a comprehensive historical census dataset featuring over 984,000 records from 16 census waves in Iceland, aimed at improving lo...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.21204] Test-Time Training with KV Binding Is Secretly Linear Attention

This paper explores the concept of Test-Time Training (TTT) with KV binding, revealing that it functions as learned linear attention rath...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.21165] PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data

PVminer is a novel NLP framework designed to detect the patient voice in patient-generated data, improving the analysis of patient-provid...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.21136] SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery

The paper presents SparkMe, a multi-agent LLM system designed for adaptive semi-structured interviewing, enhancing qualitative data colle...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.21052] Position-Aware Sequential Attention for Accurate Next Item Recommendations

The paper presents a novel kernelized self-attention mechanism designed to enhance next-item recommendations by improving the representat...

arXiv - Machine Learning · 3 min · about 1 month ago

Computer Vision

[2602.20994] Multimodal MRI Report Findings Supervised Brain Lesion Segmentation with Substructures

This paper presents a novel approach to brain lesion segmentation in MRI scans using report-supervised learning, enhancing accuracy by in...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20971] Does Order Matter : Connecting The Law of Robustness to Robust Generalization

This paper explores the relationship between the law of robustness and robust generalization in machine learning, providing a framework t...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20951] See and Fix the Flaws: Enabling VLMs and Diffusion Models to Comprehend Visual Artifacts via Agentic Data Synthesis

This paper presents ArtiAgent, a novel approach to automate the creation of artifact-annotated datasets for training visual language mode...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.20945] The Art of Efficient Reasoning: Data, Reward, and Optimization

This article explores efficient reasoning in Large Language Models (LLMs), focusing on optimizing computational resources through reward ...

arXiv - AI · 4 min · about 1 month ago

NLP

[2602.20877] E-MMKGR: A Unified Multimodal Knowledge Graph Framework for E-commerce Applications

The paper presents E-MMKGR, a unified framework for multimodal knowledge graphs tailored for e-commerce, enhancing recommendation systems...

arXiv - AI · 3 min · about 1 month ago

Computer Vision

[2602.20924] Airavat: An Agentic Framework for Internet Measurement

Airavat introduces an innovative framework for automating Internet measurement workflows, ensuring both generation and verification again...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.20752] OrthoDiffusion: A Generalizable Multi-Task Diffusion Foundation Model for Musculoskeletal MRI Interpretation

OrthoDiffusion is a novel diffusion-based model designed for multi-task interpretation of musculoskeletal MRI scans, improving diagnostic...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.20744] Voices of the Mountains: Deep Learning-Based Vocal Error Detection System for Kurdish Maqams

This article presents a deep learning-based system for detecting vocal errors in Kurdish maqams, addressing the limitations of existing a...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.20709] Onboard-Targeted Segmentation of Straylight in Space Camera Sensors

This paper presents an AI-driven methodology for segmenting straylight effects in space camera sensors, enhancing image analysis in resou...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.20677] UrbanFM: Scaling Urban Spatio-Temporal Foundation Models

The paper presents UrbanFM, a novel framework for scaling urban spatio-temporal foundation models, addressing challenges in generalizabil...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20650] Dataset Color Quantization: A Training-Oriented Framework for Dataset-Level Compression

The paper presents Dataset Color Quantization (DCQ), a framework designed to compress large-scale image datasets by reducing color-space ...

arXiv - AI · 3 min · about 1 month ago

AI Safety

[2602.20541] Maximin Share Guarantees via Limited Cost-Sensitive Sharing

This paper explores fair allocation of indivisible goods through limited cost-sensitive sharing, demonstrating how controlled sharing can...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.20532] Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training

The paper presents ACTOR-CURATOR, a novel framework for curriculum learning in reinforcement learning, enhancing post-training for large ...

arXiv - Machine Learning · 4 min · about 1 month ago

