Data Science

Data analysis, statistics, and data engineering

Top This Week

Nomadic raises $8.4 million to wrangle the data pouring off autonomous vehicles | TechCrunch
Machine Learning

The company turns footage from robots into structured, searchable datasets with a deep learning model.

TechCrunch - AI · 6 min ·
Machine Learning

[R] VLMs Behavior for Long Video Understanding

I have extensively searched long video understanding datasets such as Video-MME, MLVU, VideoBench, LongVideoBench, etc. What I have...

Reddit - Machine Learning · 1 min ·
UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·

All Content

Nature Awards launches “AI for Discovery” research prize
AI Startups

Nature Awards has launched the 'AI for Discovery' prize to honor research teams utilizing AI and machine learning to address global chall...

AI News - General · 2 min ·
The Indian women trawling the worst of the internet to train AI
AI Safety

The article explores the growing trend of Indian women working as data annotators for AI, highlighting the psychological toll of moderati...

AI Tools & Products · 4 min ·
[2602.08786] Empirically Understanding the Value of Prediction in Allocation
Data Science

This paper explores the empirical value of prediction in resource allocation, comparing it to other investments like capacity expansion a...

arXiv - Machine Learning · 3 min ·
[2510.22049] Massive Memorization with Hundreds of Trillions of Parameters for Sequential Transducer Generative Recommenders
LLMs

This paper presents VISTA, a novel two-stage modeling framework for generative recommenders that enhances scalability by summarizing user...

arXiv - Machine Learning · 4 min ·
[2510.21686] Multimodal Datasets with Controllable Mutual Information
Machine Learning

This paper presents a framework for generating multimodal datasets with controllable mutual information, enhancing the study of mutual in...

arXiv - Machine Learning · 3 min ·
[2510.11789] Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
Machine Learning

This paper examines the convergence rates for learning pairwise interactions in attention-style models, demonstrating a minimax rate that...

arXiv - Machine Learning · 3 min ·
[2509.11517] PeruMedQA: Benchmarking Large Language Models (LLMs) on Peruvian Medical Exams -- Dataset Construction and Evaluation
LLMs

The PeruMedQA study evaluates large language models (LLMs) on Peruvian medical exams, creating a specialized dataset and demonstrating th...

arXiv - Machine Learning · 4 min ·
[2504.17203] High-Fidelity And Complex Test Data Generation For Google SQL Code Generation Services
Machine Learning

This paper presents a method for generating high-fidelity test data for SQL code generation services, addressing limitations of tradition...

arXiv - Machine Learning · 4 min ·
[2501.12032] Accelerating Recommender Model ETL with a Streaming FPGA-GPU Dataflow
Machine Learning

The paper presents PipeRec, a hardware-accelerated ETL engine designed to enhance the efficiency of recommender model training by integra...

arXiv - Machine Learning · 4 min ·
[2404.07849] Overparameterized Multiple Linear Regression as Hyper-Curve Fitting
Machine Learning

This paper explores the mathematical equivalence of overparameterized multiple linear regression (MLR) to hyper-curve fitting, demonstrat...

arXiv - Machine Learning · 3 min ·
[2312.16307] Incentive-Aware Synthetic Control: Accurate Counterfactual Estimation via Incentivized Exploration
Machine Learning

The paper presents a novel approach to synthetic control methods by addressing the overlap assumption in treatment effect estimation, pro...

arXiv - Machine Learning · 4 min ·
[2211.02003] Private Blind Model Averaging - Distributed, Non-interactive, and Convergent
Machine Learning

This paper presents Private Blind Model Averaging, a method for distributed, non-interactive, and convergent learning that enhances priva...

arXiv - Machine Learning · 4 min ·
[2205.12377] Hardness of Maximum Likelihood Learning of DPPs
Machine Learning

This article presents a proof of the NP-completeness of the maximum likelihood learning problem for Determinantal Point Processes (DPPs),...

arXiv - Machine Learning · 4 min ·
[2602.11020] When Fusion Helps and When It Breaks: View-Aligned Robustness in Same-Source Financial Imaging
AI Startups

This paper explores the robustness of same-source multi-view learning in financial imaging, focusing on the effectiveness of early versus...

arXiv - Machine Learning · 3 min ·
[2601.21331] Convex Loss Functions for Support Vector Machines (SVMs) and Neural Networks
Machine Learning

This paper introduces a new convex loss function for Support Vector Machines (SVMs) and neural networks, demonstrating improved performan...

arXiv - Machine Learning · 4 min ·
[2601.13780] Principled Latent Diffusion for Graphs via Laplacian Autoencoders
Machine Learning

This paper presents LG-Flow, a novel latent graph diffusion framework that enhances graph generation efficiency by compressing graphs int...

arXiv - Machine Learning · 4 min ·
[2512.02435] Efficient Cross-Domain Offline Reinforcement Learning with Dynamics- and Value-Aligned Data Filtering
NLP

This paper presents a novel framework for cross-domain offline reinforcement learning, introducing a method that filters data based on bo...

arXiv - Machine Learning · 4 min ·
[2510.10625] ImpMIA: Leveraging Implicit Bias for Membership Inference Attack
Machine Learning

The paper introduces ImpMIA, a novel Membership Inference Attack that leverages implicit bias in neural networks to identify training sam...

arXiv - Machine Learning · 4 min ·
[2510.04091] Rethinking Consistent Multi-Label Classification Under Inexact Supervision
Machine Learning

This paper presents a novel approach to multi-label classification under inexact supervision, addressing the limitations of existing meth...

arXiv - Machine Learning · 4 min ·
[2510.02823] The Curious Case of In-Training Compression of State Space Models
Machine Learning

This paper explores in-training compression techniques for State Space Models (SSMs), demonstrating how selective dimension preservation ...

arXiv - Machine Learning · 4 min ·