Data Science

Data analysis, statistics, and data engineering

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

I tried building a memory-first AI… and ended up discovering smaller models can beat larger ones

Dataset Model Acc F1 Δ vs Log Δ vs Static Avg Params Peak Params Steps Infer ms Size Banking77-20 Logistic TF-IDF 92.37% 0.9230 +0.00pp +...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

[R] Are there ML approaches for prioritizing and routing “important” signals across complex systems?

I’ve been reading more about attention mechanisms in transformers and how they effectively learn to weight and prioritize relevant inputs...

Reddit - Machine Learning · 1 min ·

All Content

Machine Learning

[2602.23219] Takeuchi's Information Criteria as Generalization Measures for DNNs Close to NTK Regime

This paper investigates Takeuchi's Information Criterion (TIC) as a measure for generalization in deep neural networks (DNNs) near the ne...

arXiv - Machine Learning · 4 min ·
NLP

[2602.22237] Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck

This paper presents a novel approach to disaster recovery in distributed storage systems, addressing the limitations of cryptographic has...

arXiv - AI · 3 min ·
Machine Learning

[2602.22235] Unsupervised Denoising of Diffusion-Weighted Images with Bias and Variance Corrected Noise Modeling

This article presents a novel approach for unsupervised denoising of diffusion-weighted images (dMRI) by addressing noise bias and varian...

arXiv - AI · 4 min ·
Machine Learning

[2602.23188] Efficient Real-Time Adaptation of ROMs for Unsteady Flows Using Data Assimilation

This article presents a novel retraining strategy for Reduced Order Models (ROMs) that enhances real-time adaptation for unsteady flows u...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.23182] Closing the gap on tabular data with Fourier and Implicit Categorical Features

This paper explores how deep learning can better handle tabular data by addressing its limitations compared to tree-based methods, partic...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.23179] Induction Meets Biology: Mechanisms of Repeat Detection in Protein Language Models

This article explores how protein language models (PLMs) detect repeating segments in protein sequences, revealing mechanisms for identif...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22224] DS SERVE: A Framework for Efficient and Scalable Neural Retrieval

DS SERVE is a framework designed to enhance neural retrieval systems by efficiently processing large-scale text datasets, achieving low l...

arXiv - AI · 3 min ·
Data Science

[2602.23159] Benchmarking Temporal Web3 Intelligence: Lessons from the FinSurvival 2025 Challenge

The paper presents the FinSurvival 2025 Challenge, focusing on benchmarking temporal Web3 intelligence using 21.8 million transaction rec...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22221] Misinformation Exposure in the Chinese Web: A Cross-System Evaluation of Search Engines, LLMs, and AI Overviews

This article evaluates misinformation exposure on the Chinese web by comparing traditional search engines, LLMs, and AI-generated overvie...

arXiv - AI · 3 min ·
Machine Learning

[2602.23146] Partial recovery of meter-scale surface weather

The paper discusses a method for recovering meter-scale surface weather data by integrating sparse surface measurements with high-resolut...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.23142] Prediction of Diffusion Coefficients in Mixtures with Tensor Completion

This paper presents a hybrid tensor completion method for predicting temperature-dependent diffusion coefficients in binary mixtures, enh...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.23135] DyGnROLE: Modeling Asymmetry in Dynamic Graphs with Node-Role-Oriented Latent Encoding

The paper presents DyGnROLE, a transformer-based model for dynamic graphs that distinguishes between source and destination nodes to impr...

arXiv - AI · 3 min ·
Machine Learning

[2602.23128] Bound to Disagree: Generalization Bounds via Certifiable Surrogates

The paper presents new disagreement-based certificates for generalization bounds in deep learning models, addressing limitations of exist...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.22213] Enriching Taxonomies Using Large Language Models

The paper presents Taxoria, a novel pipeline that enhances existing taxonomies using Large Language Models (LLMs), addressing issues of l...

arXiv - AI · 3 min ·
Machine Learning

[2602.23113] Learning Physical Operators using Neural Operators

This paper presents a novel physics-informed training framework for neural operators that enhances their ability to generalize beyond tra...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.23330] Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

This article presents a multi-agent LLM framework for financial trading, emphasizing fine-grained task decomposition to enhance decision-...

arXiv - AI · 4 min ·
LLMs

[2602.23329] LLM Novice Uplift on Dual-Use, In Silico Biology Tasks

This article examines the effectiveness of large language models (LLMs) in enhancing novice users' performance on complex biological task...

arXiv - AI · 4 min ·
Machine Learning

[2602.23089] Physics-informed neural particle flow for the Bayesian update step

This paper introduces a physics-informed neural particle flow method for the Bayesian update step, addressing computational challenges in...

arXiv - Machine Learning · 4 min ·
Data Science

[2602.23318] Generalized Rapid Action Value Estimation in Memory-Constrained Environments

The paper presents GRAVE2, GRAVER, and GRAVER2, enhanced algorithms for Generalized Rapid Action Value Estimation, addressing memory cons...

arXiv - AI · 3 min ·
LLMs

[2602.23060] RhythmBERT: A Self-Supervised Language Model Based on Latent Representations of ECG Waveforms for Heart Disease Detection

RhythmBERT is a novel self-supervised language model designed for ECG waveform analysis, enhancing heart disease detection by treating EC...

arXiv - Machine Learning · 4 min ·