Data Science

Data analysis, statistics, and data engineering

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
NLP

Has anyone here switched to TeraBox recently? Is it actually worth it?

I’ve been seeing more people talk about TeraBox lately, especially around storage for AI-related workflows. Curious if anyone here has us...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Google quietly launched an AI dictation app that works offline

Google's new offline-first dictation app uses Gemma AI models to take on apps like Wispr Flow.

TechCrunch - AI · 4 min ·

All Content

Machine Learning

[2602.10531] From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

This paper explores the dynamics of iterative training on contaminated data sources, demonstrating that model performance can improve des...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.21093] High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

This paper explores the learning dynamics of multi-pass Stochastic Gradient Descent (SGD) in high-dimensional multi-index models, providi...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.17973] Boosting methods for interval-censored data with regression and classification

This article presents novel nonparametric boosting methods tailored for interval-censored data, enhancing regression and classification t...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2512.13532] Adaptive Sampling for Hydrodynamic Stability

This article presents an adaptive sampling method for efficiently detecting bifurcation boundaries in fluid flow problems, enhancing the ...

arXiv - Machine Learning · 4 min ·
LLMs

[2512.03310] Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs

The paper introduces Randomized Masked Finetuning (RMFT), a technique designed to reduce the memorization of personally identifiable info...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2512.09530] Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport

This paper explores self-attention training for tabular data using Optimal Transport (OT), presenting a novel OT-based algorithm that enh...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2511.19879] Learning Degenerate Manifolds of Frustrated Magnets with Boltzmann Machines

This paper explores the use of Restricted Boltzmann Machines (RBMs) to model spin configurations in frustrated magnets, demonstrating the...

arXiv - Machine Learning · 3 min ·
Data Science

[2511.14147] Imaging with super-resolution in changing random media

This article presents a novel imaging algorithm that utilizes strong scattering to achieve super-resolution in dynamic random media, enha...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2511.17772] Weighted Birkhoff Averages Accelerate Data-Driven Methods

The paper discusses Weighted Birkhoff Averages, a method that accelerates convergence in data-driven algorithms for dynamical systems, de...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2511.04681] Dark Energy Survey Year 3 results: Simulation-based $w$CDM inference from weak lensing and galaxy clustering maps with deep learning: Analysis design

This article presents a novel simulation-based inference pipeline utilizing deep learning to analyze weak lensing and galaxy clustering m...

arXiv - Machine Learning · 5 min ·
Machine Learning

[2511.03952] High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes

This paper presents high-dimensional limit theorems for Stochastic Gradient Descent (SGD) with Polyak Momentum and adaptive step-sizes, c...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2509.20345] Statistical Inference Leveraging Synthetic Data with Distribution-Free Guarantees

This article presents the GEneral Synthetic-Powered Inference (GESPI) framework, which enhances statistical inference by integrating synt...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2504.13519] Filter2Noise: A Framework for Interpretable and Zero-Shot Low-Dose CT Image Denoising

The paper presents Filter2Noise, a novel framework for interpretable and zero-shot low-dose CT image denoising, achieving state-of-the-ar...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2503.20711] Demand Estimation with Text and Image Data

This article presents a novel demand estimation method that utilizes unstructured data from text and images to enhance substitution patte...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2502.20063] Strategic Hiring under Algorithmic Monoculture

The paper explores strategic hiring in labor markets dominated by algorithmic evaluation, highlighting the inefficiencies of naive hiring...

arXiv - Machine Learning · 4 min ·
LLMs

[2412.00364] LMSeg: Unleashing the Power of Large-Scale Models for Open-Vocabulary Semantic Segmentation

The paper presents LMSeg, a novel approach for open-vocabulary semantic segmentation that enhances visual and linguistic feature alignmen...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2412.10537] VerifiableFL: Verifiable Claims for Federated Learning using Exclaves

The paper presents VerifiableFL, a system for federated learning that ensures verifiable claims about model training using exclaves, enha...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2512.00036] Refined Bayesian Optimization for Efficient Beam Alignment in Intelligent Indoor Wireless Environments

This article presents a refined Bayesian optimization framework for efficient beam alignment in intelligent indoor wireless environments,...

arXiv - AI · 4 min ·
Machine Learning

[2510.12768] Uncertainty Matters in Dynamic Gaussian Splatting for Monocular 4D Reconstruction

This paper presents USplat4D, a novel framework for monocular 4D reconstruction that incorporates uncertainty in dynamic Gaussian splatti...

arXiv - AI · 4 min ·
Data Science

[2602.11320] Efficient Analysis of the Distilled Neural Tangent Kernel

The paper presents a novel approach to reduce the computational complexity of Neural Tangent Kernel (NTK) methods through dataset distill...

arXiv - Machine Learning · 3 min ·