Data Science

Data analysis, statistics, and data engineering

Top This Week

Data Science

[2604.01619] Automatic Image-Level Morphological Trait Annotation for Organismal Images

arXiv - AI · 4 min ·
Machine Learning

[2510.25241] One-shot Adaptation of Humanoid Whole-body Motion with Walking Priors

arXiv - AI · 3 min ·
LLMs

[2509.17766] A State-Update Prompting Strategy for Efficient and Robust Multi-turn Dialogue

arXiv - AI · 3 min ·

All Content

Data Science

[P] V2 of a PaperWithCode alternative - Wizwand

Wizwand, an alternative to PaperWithCode, has launched its second version, addressing dataset inconsistencies and improving leaderboard a...

Reddit - Machine Learning · 1 min ·
Machine Learning

[R] The "Data Scientist" title is the worst paying title in ML (EMEA).

A recruiter reveals that 'Data Scientist' is the lowest-paying title in machine learning across Europe, based on an analysis of over 350,...

Reddit - Machine Learning · 1 min ·
AI Infrastructure

[P] CUDA scan kernels: hierarchical vs single-pass, decoupled lookbacks

This article explores efficient implementations of scan/prefix-sum algorithms on GPUs, comparing hierarchical and single-pass methods, an...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] Open Source Fraud Detection System handling 0.17% class imbalance with Random Forest

This article discusses the development of an open-source credit card fraud detection system utilizing Random Forest to address class imba...

Reddit - Machine Learning · 1 min ·
Open Source AI

Overcoming the "Data Scarcity" Barrier: Synthetic Personas Accelerate AI Development in Japan

The article discusses how synthetic personas can help overcome data scarcity in AI development in Japan, showcasing NTT DATA's innovative...

Hugging Face Blog · 2 min ·
Data Science

Microsoft has a new plan to prove what’s real and what’s AI online

A new proposal calls on social media and AI companies to adopt strict verification, but the company hasn’t committed to following its own...

MIT Technology Review - AI · 9 min ·
AI Startups

It’s MAGA v Broligarch in the battle over prediction markets

The article discusses the conflict between MAGA supporters and the Broligarch faction over the regulation of prediction markets, highligh...

The Verge - AI · 9 min ·
Machine Learning

[D] Research on Self-supervised fine-tuning of "sentence" embeddings?

The article discusses the challenges and methods of fine-tuning sentence embeddings from transformer models, particularly focusing on agg...

Reddit - Machine Learning · 1 min ·
Machine Learning

[R] Analysis of 350+ ML competitions in 2025

The article analyzes over 350 machine learning competitions from 2025, highlighting trends, winning solutions, and insights from various ...

Reddit - Machine Learning · 1 min ·
AI Startups

Billionaire Stanley Druckenmiller Sells Meta Platforms Stock and Buys an AI Stock Up 210,000% Since Its IPO

Billionaire Stanley Druckenmiller sold his entire stake in Meta Platforms and invested in Amazon, highlighting shifts in AI investment st...

AI News - General · 6 min ·
Machine Learning

[D] Which hyperparameter search library to use?

The article discusses various hyperparameter optimization libraries in machine learning, including hyperopt, Optuna, sklearn.GridSearchCV...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] I Made my first Transformer architecture code

A Reddit user shares their first implementation of a Transformer architecture using PyTorch, detailing the structure and parameters used,...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] 1T performance from a 397B model. How?

The article discusses the performance of a 397 billion parameter model, questioning whether its success is due to architectural advanceme...

Reddit - Machine Learning · 1 min ·
Machine Learning

[2602.10531] From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

This paper explores the dynamics of iterative training on contaminated data sources, demonstrating that model performance can improve des...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.21093] High-dimensional learning dynamics of multi-pass Stochastic Gradient Descent in multi-index models

This paper explores the learning dynamics of multi-pass Stochastic Gradient Descent (SGD) in high-dimensional multi-index models, providi...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.17973] Boosting methods for interval-censored data with regression and classification

This article presents novel nonparametric boosting methods tailored for interval-censored data, enhancing regression and classification t...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2512.13532] Adaptive Sampling for Hydrodynamic Stability

This article presents an adaptive sampling method for efficiently detecting bifurcation boundaries in fluid flow problems, enhancing the ...

arXiv - Machine Learning · 4 min ·
LLMs

[2512.03310] Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs

The paper introduces Randomized Masked Finetuning (RMFT), a technique designed to reduce the memorization of personally identifiable info...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2512.09530] Transformers for Tabular Data: A Training Perspective of Self-Attention via Optimal Transport

This paper explores self-attention training for tabular data using Optimal Transport (OT), presenting a novel OT-based algorithm that enh...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2511.19879] Learning Degenerate Manifolds of Frustrated Magnets with Boltzmann Machines

This paper explores the use of Restricted Boltzmann Machines (RBMs) to model spin configurations in frustrated magnets, demonstrating the...

arXiv - Machine Learning · 3 min ·