Data Science Guide

A comprehensive guide to the best data science resources, organized by type. Curated by AI News.

Research

[2510.26722] Non-Convex Over-the-Air Heterogeneous Federated Learning: A Bias-Variance Trade-off

This paper explores the challenges of heterogeneous federated learning in wireless networks, focusing on the bias-variance trade-off in non-convex scenarios. It presents a novel...

arXiv - Machine Learning

[2602.12426] Interference-Robust Non-Coherent Over-the-Air Computation for Decentralized Optimization

This paper presents an interference-robust non-coherent over-the-air computation (IR-NCOTA) method for decentralized optimization, enhancing consensus estimation in wireless net...

arXiv - Machine Learning

[2602.15546] CEPAE: Conditional Entropy-Penalized Autoencoders for Time Series Counterfactuals

The paper introduces CEPAE, a novel approach utilizing Conditional Entropy-Penalized Autoencoders for effective counterfactual inference in time series data, demonstrating super...

arXiv - Machine Learning

[2602.15568] Scenario Approach with Post-Design Certification of User-Specified Properties

This paper introduces a scenario approach for post-design certification of user-specified properties, enhancing reliability without additional test datasets.

arXiv - Machine Learning

[R] Large-Scale Online Deanonymization with LLMs

This paper demonstrates how large language models (LLMs) can deanonymize users based on their online posts, achieving high precision across various platforms.

Reddit - Machine Learning

[2510.03313] Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining

The paper introduces a new dimensionless data-quality parameter for language model pretraining, establishing a quality-aware scaling law that predicts loss based on model size, ...

arXiv - Machine Learning

[2509.18949] Towards Privacy-Aware Bayesian Networks: A Credal Approach

This paper presents a novel approach to privacy-aware Bayesian networks using credal networks, addressing the trade-off between privacy and model utility in probabilistic graphi...

arXiv - AI

[2602.23321] Deep ensemble graph neural networks for probabilistic cosmic-ray direction and energy reconstruction in autonomous radio arrays

This paper presents a novel method using deep ensemble graph neural networks to accurately reconstruct the direction and energy of cosmic rays detected by autonomous radio arrays.

arXiv - Machine Learning

[2509.15429] Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data

This paper presents a Random Matrix Theory-guided approach to sparse PCA for single-cell RNA-seq data, enhancing dimensionality reduction and cell-type classification accuracy.

arXiv - Machine Learning

Articles

[2602.15277] Accelerating Large-Scale Dataset Distillation via Exploration-Exploitation Optimization

This paper presents Exploration-Exploitation Distillation (E^2D), a method for efficient large-scale dataset distillation that balances accuracy and computational efficiency, ac...

arXiv - Machine Learning

Ask HN: How can I pivot from software engineering back into neuroscience?

A software engineer seeks guidance on transitioning back to neuroscience, leveraging skills in data analysis and machine learning to contribute to the field.

Hacker News - AI

[2602.15239] Size Transferability of Graph Transformers with Convolutional Positional Encodings

This paper explores the size transferability of Graph Transformers (GTs) with convolutional positional encodings, demonstrating their ability to generalize from small to larger ...

arXiv - Machine Learning

5 ways bp uses AI and other tech to drive performance

The article explores five innovative ways BP leverages AI and technology to enhance operational performance, showcasing its commitment to digital transformation.

AI News - General

[2602.10947] Computational Phenomenology of Temporal Experience in Autism: Quantifying the Emotional and Narrative Characteristics of Lived Unpredictability

This article explores the emotional and narrative characteristics of temporal experience in autistic individuals, highlighting the unpredictability they face and its impact on r...

arXiv - AI

The Small English Town Swept Up in the Global AI Arms Race | WIRED

Residents of Potters Bar protest against a planned data center on green belt land, highlighting tensions between AI infrastructure demands and local environmental concerns.

Wired - AI

[2602.12605] Block-Sample MAC-Bayes Generalization Bounds

The paper introduces Block-Sample MAC-Bayes bounds, a new approach to generalization error estimation in machine learning, enhancing traditional PAC-Bayes bounds by focusing on ...

arXiv - Machine Learning

[2602.12613] Coden: Efficient Temporal Graph Neural Networks for Continuous Prediction

The paper introduces Coden, an efficient Temporal Graph Neural Network (TGNN) model designed for continuous predictions, overcoming limitations of existing TGNNs in dynamic grap...

arXiv - Machine Learning

Is alignment missing a dataset that no one has built yet?

The article discusses the absence of a dataset that captures the unique nuances of human identity, which are not reflected in existing language models, highlighting potential im...

Reddit - Artificial Intelligence

[2602.15236] BindCLIP: A Unified Contrastive-Generative Representation Learning Framework for Virtual Screening

BindCLIP introduces a novel framework for virtual screening, enhancing ligand identification through a unified contrastive-generative learning approach that improves binding int...

arXiv - Machine Learning
