Data Science Guide
A comprehensive guide to the best data science resources, organized by type. Curated by AI News.
Research
[2510.26722] Non-Convex Over-the-Air Heterogeneous Federated Learning: A Bias-Variance Trade-off
This paper explores the challenges of heterogeneous federated learning in wireless networks, focusing on the bias-variance trade-off in non-convex scenarios. It presents a novel...
[2602.12426] Interference-Robust Non-Coherent Over-the-Air Computation for Decentralized Optimization
This paper presents an interference-robust non-coherent over-the-air computation (IR-NCOTA) method for decentralized optimization, enhancing consensus estimation in wireless net...
[2602.15546] CEPAE: Conditional Entropy-Penalized Autoencoders for Time Series Counterfactuals
The paper introduces CEPAE, a novel approach utilizing Conditional Entropy-Penalized Autoencoders for effective counterfactual inference in time series data, demonstrating super...
[2602.15568] Scenario Approach with Post-Design Certification of User-Specified Properties
This paper introduces a scenario approach for post-design certification of user-specified properties, enhancing reliability without additional test datasets.
[R] Large-Scale Online Deanonymization with LLMs
This paper demonstrates how large language models (LLMs) can deanonymize users based on their online posts, achieving high precision across various platforms.
[2510.03313] Scaling Laws Revisited: Modeling the Role of Data Quality in Language Model Pretraining
The paper introduces a new dimensionless data-quality parameter for language model pretraining, establishing a quality-aware scaling law that predicts loss based on model size, ...
[2509.18949] Towards Privacy-Aware Bayesian Networks: A Credal Approach
This paper presents a novel approach to privacy-aware Bayesian networks using credal networks, addressing the trade-off between privacy and model utility in probabilistic graphi...
[2602.23321] Deep ensemble graph neural networks for probabilistic cosmic-ray direction and energy reconstruction in autonomous radio arrays
This paper presents a novel method using deep ensemble graph neural networks to accurately reconstruct the direction and energy of cosmic rays detected by autonomous radio arrays.
[2509.15429] Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data
This paper presents a Random Matrix Theory-guided approach to sparse PCA for single-cell RNA-seq data, enhancing dimensionality reduction and cell-type classification accuracy.
Articles
[2602.15277] Accelerating Large-Scale Dataset Distillation via Exploration-Exploitation Optimization
This paper presents Exploration-Exploitation Distillation (E^2D), a method for efficient large-scale dataset distillation that balances accuracy and computational efficiency, ac...
Ask HN: How can I pivot from software engineering back into neuroscience?
A software engineer seeks guidance on transitioning back to neuroscience, leveraging skills in data analysis and machine learning to contribute to the field.
[2602.15239] Size Transferability of Graph Transformers with Convolutional Positional Encodings
This paper explores the size transferability of Graph Transformers (GTs) with convolutional positional encodings, demonstrating their ability to generalize from small to larger ...
5 ways bp uses AI and other tech to drive performance
The article explores five innovative ways BP leverages AI and technology to enhance operational performance, showcasing its commitment to digital transformation.
[2602.10947] Computational Phenomenology of Temporal Experience in Autism: Quantifying the Emotional and Narrative Characteristics of Lived Unpredictability
This article explores the emotional and narrative characteristics of temporal experience in autistic individuals, highlighting the unpredictability they face and its impact on r...
The Small English Town Swept Up in the Global AI Arms Race | WIRED
Residents of Potters Bar protest against a planned data center on green belt land, highlighting tensions between AI infrastructure demands and local environmental concerns.
[2602.12605] Block-Sample MAC-Bayes Generalization Bounds
The paper introduces Block-Sample MAC-Bayes bounds, a new approach to generalization error estimation in machine learning, enhancing traditional PAC-Bayes bounds by focusing on ...
[2602.12613] Coden: Efficient Temporal Graph Neural Networks for Continuous Prediction
The paper introduces Coden, an efficient Temporal Graph Neural Network (TGNN) model designed for continuous predictions, overcoming limitations of existing TGNNs in dynamic grap...
Is alignment missing a dataset that no one has built yet?
The article discusses the absence of a dataset that captures the unique nuances of human identity, which are not reflected in existing language models, highlighting potential im...
[2602.15236] BindCLIP: A Unified Contrastive-Generative Representation Learning Framework for Virtual Screening
BindCLIP introduces a novel framework for virtual screening, enhancing ligand identification through a unified contrastive-generative learning approach that improves binding int...