Data Science

Data analysis, statistics, and data engineering

Top This Week

Machine Learning

What image/video training data is hardest to find right now? [R]

I'm building a crowdsourced photo collection platform (contributors take photos with smartphones, we auto-label with YOLO/CLIP + enrich w...

Reddit - Machine Learning · 1 min ·
UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Accelerating science with AI and simulations
Machine Learning

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min ·

All Content

[2602.14456] Traceable Latent Variable Discovery Based on Multi-Agent Collaboration
AI Agents

The paper presents TLVD, a novel causal modeling framework that integrates large language models with traditional causal discovery algori...

arXiv - Machine Learning · 4 min ·
[2602.14432] S2D: Selective Spectral Decay for Quantization-Friendly Conditioning of Neural Activations
Machine Learning

The paper introduces Selective Spectral Decay (S2D), a method to improve quantization in neural networks by addressing activation outlier...

arXiv - AI · 4 min ·
[2602.14430] A unified framework for evaluating the robustness of machine-learning interpretability for prospect risking
Machine Learning

This article presents a unified framework for evaluating the robustness of machine-learning interpretability, specifically in the context...

arXiv - Machine Learning · 4 min ·
[2602.14423] The geometry of invariant learning: an information-theoretic analysis of data augmentation and generalization
Machine Learning

This article presents an information-theoretic framework analyzing the role of data augmentation in machine learning, focusing on its imp...

arXiv - AI · 4 min ·
[2602.14375] A Study on Multi-Class Online Fuzzy Classifiers for Dynamic Environments
Machine Learning

This paper presents a multi-class online fuzzy classifier designed for dynamic environments, extending traditional two-class fuzzy classi...

arXiv - Machine Learning · 3 min ·
[2602.13282] GraFSTNet: Graph-based Frequency SpatioTemporal Network for Cellular Traffic Prediction
Machine Learning

The paper presents GraFSTNet, a novel framework for cellular traffic prediction that integrates spatio-temporal modeling with time-freque...

arXiv - AI · 3 min ·
[2602.13279] LLM-Enhanced Rumor Detection via Virtual Node Induced Edge Prediction
LLMs

This paper presents a novel framework for rumor detection on social networks, utilizing Large Language Models (LLMs) to enhance the ident...

arXiv - AI · 3 min ·
[2602.13259] Learning Physiology-Informed Vocal Spectrotemporal Representations for Speech Emotion Recognition
Machine Learning

This paper presents PhysioSER, a novel approach for speech emotion recognition that integrates physiological insights into vocal represen...

arXiv - AI · 4 min ·
[2602.13249] Boltz is a Strong Baseline for Atom-level Representation Learning
LLMs

The paper presents Boltz as a competitive baseline for atom-level representation learning in molecular tasks, particularly in ADMET prope...

arXiv - Machine Learning · 3 min ·
[2602.13246] Global AI Bias Audit for Technical Governance
LLMs

This article discusses a global audit of Large Language Models (LLMs) focusing on geographic and socioeconomic biases in AI governance, h...

arXiv - AI · 4 min ·
[2602.14274] Integrating Unstructured Text into Causal Inference: Empirical Evidence from Real Data
LLMs

This paper presents a framework for integrating unstructured text into causal inference, demonstrating its effectiveness against traditio...

arXiv - AI · 3 min ·
[2602.14272] Radial-VCReg: More Informative Representation Learning Through Radial Gaussianization
Data Science

The paper presents Radial-VCReg, a novel approach to self-supervised learning that enhances representation learning by addressing the lim...

arXiv - Machine Learning · 3 min ·
[2602.14267] Cross-household Transfer Learning Approach with LSTM-based Demand Forecasting
Machine Learning

The paper presents DELTAiF, a transfer learning framework that enhances LSTM-based demand forecasting for household hot water consumption...

arXiv - AI · 4 min ·
[2602.14251] Multi-Agent Debate: A Unified Agentic Framework for Tabular Anomaly Detection
LLMs

The paper presents the Multi-Agent Debate (MAD) framework for tabular anomaly detection, leveraging multiple ML detectors and a large lan...

arXiv - AI · 4 min ·
[2602.14233] Evaluating LLMs in Finance Requires Explicit Bias Consideration
LLMs

This paper discusses the need for explicit bias consideration in evaluating Large Language Models (LLMs) used in finance, identifying fiv...

arXiv - AI · 3 min ·
[2602.14231] Robust multi-task boosting using clustering and local ensembling
Machine Learning

The paper presents Robust Multi-Task Boosting using Clustering and Local Ensembling (RMB-CLE), a framework that enhances multi-task learn...

arXiv - Machine Learning · 3 min ·
[2602.14208] Fast Catch-Up, Late Switching: Optimal Batch Size Scheduling via Functional Scaling Laws
Machine Learning

This paper explores optimal batch size scheduling in deep learning, revealing that task difficulty influences the effectiveness of batch ...

arXiv - Machine Learning · 4 min ·
[2602.14200] TS-Haystack: A Multi-Scale Retrieval Benchmark for Time Series Language Models
LLMs

The paper introduces TS-Haystack, a benchmark for evaluating Time Series Language Models (TSLMs) on long-context retrieval tasks, address...

arXiv - Machine Learning · 4 min ·
[2602.15019] Hunt Globally: Deep Research AI Agents for Drug Asset Scouting in Investing, Business Development, and Search & Evaluation
AI Infrastructure

The paper discusses the development of a Deep Research AI agent, Bioptic Agent, designed for drug asset scouting, particularly in non-U.S...

arXiv - AI · 4 min ·
[2602.14161] When Benchmarks Lie: Evaluating Malicious Prompt Classifiers Under True Distribution Shift
LLMs

This paper evaluates the effectiveness of malicious prompt classifiers under true distribution shifts, revealing significant performance ...

arXiv - Machine Learning · 4 min ·
