Data Science

Data analysis, statistics, and data engineering

Top This Week

Machine Learning

Nomadic raises $8.4 million to wrangle the data pouring off autonomous vehicles

The company turns footage from robots into structured, searchable datasets with a deep learning model.

TechCrunch - AI · 6 min · about 5 hours ago

Machine Learning

[R] VLMs Behavior for Long Video Understanding

I have searched extensively for long video understanding datasets such as Video-MME, MLVU, VideoBench, LongVideoBench, etc. What I have...

Reddit - Machine Learning · 1 min · about 10 hours ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 11 hours ago

All Content

AI Startups

Nature Awards launches “AI for Discovery” research prize

Nature Awards has launched the 'AI for Discovery' prize to honor research teams utilizing AI and machine learning to address global chall...

AI News - General · 2 min · about 1 month ago

AI Safety

The Indian women trawling the worst of the internet to train AI

The article explores the growing trend of Indian women working as data annotators for AI, highlighting the psychological toll of moderati...

AI Tools & Products · 4 min · about 1 month ago

Data Science

[2602.08786] Empirically Understanding the Value of Prediction in Allocation

This paper explores the empirical value of prediction in resource allocation, comparing it to other investments like capacity expansion a...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2510.22049] Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders

This paper presents VISTA, a novel two-stage modeling framework for generative recommenders that enhances scalability by summarizing user...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2510.21686] Multimodal Datasets with Controllable Mutual Information

This paper presents a framework for generating multimodal datasets with controllable mutual information, enhancing the study of mutual in...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2510.11789] Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

This paper examines the convergence rates for learning pairwise interactions in attention-style models, demonstrating a minimax rate that...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2509.11517] PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation

The PeruMedQA study evaluates large language models (LLMs) on Peruvian medical exams, creating a specialized dataset and demonstrating th...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2504.17203] High-Fidelity And Complex Test Data Generation For Google SQL Code Generation Services

This paper presents a method for generating high-fidelity test data for SQL code generation services, addressing limitations of tradition...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2501.12032] Accelerating Recommender Model ETL with a Streaming FPGA-GPU Dataflow

The paper presents PipeRec, a hardware-accelerated ETL engine designed to enhance the efficiency of recommender model training by integra...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2404.07849] Overparameterized Multiple Linear Regression as Hyper-Curve Fitting

This paper explores the mathematical equivalence of overparameterized multiple linear regression (MLR) to hyper-curve fitting, demonstrat...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2312.16307] Incentive-Aware Synthetic Control: Accurate Counterfactual Estimation via Incentivized Exploration

The paper presents a novel approach to synthetic control methods by addressing the overlap assumption in treatment effect estimation, pro...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2211.02003] Private Blind Model Averaging - Distributed, Non-interactive, and Convergent

This paper presents Private Blind Model Averaging, a method for distributed, non-interactive, and convergent learning that enhances priva...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2205.12377] Hardness of Maximum Likelihood Learning of DPPs

This article presents a proof of the NP-completeness of the maximum likelihood learning problem for Determinantal Point Processes (DPPs),...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Startups

[2602.11020] When Fusion Helps and When It Breaks: View-Aligned Robustness in Same-Source Financial Imaging

This paper explores the robustness of same-source multi-view learning in financial imaging, focusing on the effectiveness of early versus...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2601.21331] Convex Loss Functions for Support Vector Machines (SVMs) and Neural Networks

This paper introduces a new convex loss function for Support Vector Machines (SVMs) and neural networks, demonstrating improved performan...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2601.13780] Principled Latent Diffusion for Graphs via Laplacian Autoencoders

This paper presents LG-Flow, a novel latent graph diffusion framework that enhances graph generation efficiency by compressing graphs int...

arXiv - Machine Learning · 4 min · about 1 month ago

NLP

[2512.02435] Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering

This paper presents a novel framework for cross-domain offline reinforcement learning, introducing a method that filters data based on bo...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2510.10625] ImpMIA: Leveraging Implicit Bias for Membership Inference Attack

The paper introduces ImpMIA, a novel Membership Inference Attack that leverages implicit bias in neural networks to identify training sam...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2510.04091] Rethinking Consistent Multi-Label Classification Under Inexact Supervision

This paper presents a novel approach to multi-label classification under inexact supervision, addressing the limitations of existing meth...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2510.02823] The Curious Case of In-Training Compression of State Space Models

This paper explores in-training compression techniques for State Space Models (SSMs), demonstrating how selective dimension preservation ...

arXiv - Machine Learning · 4 min · about 1 month ago

