Data Science

Data analysis, statistics, and data engineering

Top This Week

Machine Learning

[P] citracer: a small CLI tool to trace where a concept comes from in a citation graph

Hi all, I made a small tool that I've been using for my own literature reviews and figured I'd share in case it's useful to anyone else. ...

Reddit - Machine Learning · 1 min · 39 minutes ago

Data Science

What actually makes something the best AI meeting recorder?

I’ve been trying a few meeting tools lately and realized I care way less about flashy summaries than I thought. What I actually want is p...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

LLMs

[D] The Bitter Lesson of Optimization: Why training Neural Networks to update themselves is mathematically brutal (but probably inevitable)

Are we still stuck in the "feature engineering" era of optimization? We trust neural networks to learn unimaginably complex patterns from...

Reddit - Machine Learning · 1 min · about 4 hours ago

All Content

Data Science

[2602.15968] From Reflection to Repair: A Scoping Review of Dataset Documentation Tools

This article presents a scoping review of dataset documentation tools, analyzing motivations behind their design and factors affecting th...

arXiv - AI · 4 min · about 2 months ago

Data Science

[2602.15958] DocSplit: A Comprehensive Benchmark Dataset and Evaluation Approach for Document Packet Recognition and Splitting

The paper introduces DocSplit, a benchmark dataset and evaluation framework for document packet recognition and splitting, addressing cha...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16709] Knowledge-Embedded Latent Projection for Robust Representation Learning

This article presents a novel knowledge-embedded latent projection model aimed at improving representation learning in high-dimensional d...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.15923] A fully differentiable framework for training proxy Exchange Correlation Functionals for periodic systems

This paper presents a fully differentiable framework for integrating machine learning models into Density Functional Theory (DFT) for per...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.16698] Causality is Key for Interpretability Claims to Generalise

This paper discusses the importance of causality in interpretability research for large language models, highlighting pitfalls in general...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.15919] Generalized Leverage Score for Scalable Assessment of Privacy Vulnerability

The paper presents a method for assessing privacy vulnerability in machine learning models using a generalized leverage score, enabling e...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.16697] Protecting the Undeleted in Machine Unlearning

The paper discusses machine unlearning, focusing on the privacy risks associated with undeleted data when specific data points are remove...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.16684] Retrieval-Augmented Foundation Models for Matched Molecular Pair Transformations to Recapitulate Medicinal Chemistry Intuition

This article presents a novel approach using retrieval-augmented foundation models for matched molecular pair transformations, enhancing ...

arXiv - Machine Learning · 3 min · about 2 months ago

Data Science

[2602.16673] Neighborhood Stability as a Measure of Nearest Neighbor Searchability

The paper introduces two measures for assessing the searchability of datasets in clustering-based Approximate Nearest Neighbor Search (AN...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.15909] Resp-Agent: An Agent-Based System for Multimodal Respiratory Sound Generation and Disease Diagnosis

The paper presents Resp-Agent, an innovative agent-based system for generating multimodal respiratory sounds and diagnosing diseases, add...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16643] Factorization Machine with Quadratic-Optimization Annealing for RNA Inverse Folding and Evaluation of Binary-Integer Encoding and Nucleotide Assignment

This article presents a novel method using factorization machines with quadratic-optimization annealing (FMQA) to tackle the RNA inverse ...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16600] Predicting The Cop Number Using Machine Learning

This article explores the use of machine learning to predict the cop number in graph theory, demonstrating the effectiveness of classical...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.15890] Surrogate Modeling for Neutron Transport: A Neural Operator Approach

This article presents a neural operator framework for surrogate modeling in neutron transport, demonstrating significant computational ef...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16596] Sequential Membership Inference Attacks

The paper presents a novel approach to Membership Inference Attacks (MIAs) by developing an optimal attack strategy, SeMI*, leveraging mo...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16579] AIFL: A Global Daily Streamflow Forecasting Model Using Deterministic LSTM Pre-trained on ERA5-Land and Fine-tuned on IFS

The paper presents AIFL, a deterministic LSTM model for global daily streamflow forecasting, trained on ERA5-Land and fine-tuned on IFS, ...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16573] MoDE-Boost: Boosting Shared Mobility Demand with Edge-Ready Prediction Models

The paper presents MoDE-Boost, a novel approach using gradient boosting models to forecast urban mobility demand, enhancing efficiency in...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16570] Steering diffusion models with quadratic rewards: a fine-grained analysis

This article presents a detailed analysis of sampling from reward-tilted diffusion models, focusing on quadratic rewards and their comput...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16531] Transfer Learning of Linear Regression with Multiple Pretrained Models: Benefiting from More Pretrained Models via Overparameterization Debiasing

This paper explores transfer learning in linear regression using multiple pretrained models, highlighting the benefits of overparameteriz...

arXiv - Machine Learning · 3 min · about 2 months ago

NLP

[2602.15866] NLP Privacy Risk Identification in Social Media (NLP-PRISM): A Survey

This survey presents the NLP-PRISM framework for identifying privacy risks in social media NLP applications, analyzing 203 peer-reviewed ...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16530] FEKAN: Feature-Enriched Kolmogorov-Arnold Networks

The paper introduces Feature-Enriched Kolmogorov-Arnold Networks (FEKAN), an advanced model that enhances computational efficiency and pr...

arXiv - Machine Learning · 4 min · about 2 months ago

