[2507.11891] Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?

arXiv - Machine Learning

Summary

This paper explores the impact of data sharing on A/B experiments in recommendation systems, focusing on how interference affects algorithm performance evaluation under a multi-armed bandit framework.

Why It Matters

Understanding the implications of data sharing in A/B testing is crucial for practitioners in machine learning and recommendation systems. This research characterizes when interference-induced bias can reverse the conclusion of an algorithm comparison, helping practitioners judge when an experiment's verdict can be trusted.

Key Takeaways

  • Data sharing can lead to biased estimates in A/B experiments.
  • The stable unit treatment value assumption (SUTVA) may not hold in large-scale systems.
  • The balance of exploration versus exploitation determines how strongly data sharing distorts the comparison.
  • A detection procedure based on ramp-up experiments can identify incorrect comparisons.
  • Understanding interference is essential for accurate algorithm performance assessment.

Statistics > Machine Learning
arXiv:2507.11891 (stat)
[Submitted on 16 Jul 2025 (v1), last revised 23 Feb 2026 (this version, v2)]

Title: Choosing the Better Bandit Algorithm under Data Sharing: When Do A/B Experiments Work?
Authors: Shuangning Li, Chonghuan Wang, Jingyan Wang

Abstract: We study A/B experiments that are designed to compare the performance of two recommendation algorithms. Prior work has observed that the stable unit treatment value assumption (SUTVA) often does not hold in large-scale recommendation systems, and hence the estimate for the global treatment effect (GTE) is biased. Specifically, units under the treatment and control algorithms contribute to a shared pool of data that subsequently trains both algorithms, resulting in interference between the two groups. In this paper, we investigate when such interference may affect our decision making on which algorithm is better. We formalize this insight under a multi-armed bandit framework and theoretically characterize when the sign of the difference-in-means estimator of the GTE under data sharing aligns with or contradicts the sign of the true GTE. Our analysis identifies the level of exploration versus exploitation as a key determinant of how data sharing impacts decision making, and we propose a detection procedure based on...
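The interference mechanism described in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the paper's model: it assumes two epsilon-greedy bandits with different exploration rates (a stand-in for "treatment" and "control" algorithms), alternating users between groups, and compares the difference-in-means estimate when both groups update one shared pool of arm statistics versus when each learns from its own data. All function names and parameter values here are illustrative.

```python
import random

def run_experiment(share_data, eps_treat=0.05, eps_ctrl=0.3,
                   arm_means=(0.4, 0.6), n_users=20000, seed=0):
    """Simulate an A/B test between two epsilon-greedy bandits.

    If share_data is True, both groups read and write one shared pool of
    arm statistics (interference between groups); otherwise each group
    learns only from its own pulls. Returns the mean reward observed in
    the treatment and control groups.
    """
    rng = random.Random(seed)
    pools = 1 if share_data else 2
    # light prior (1 pull, reward 0.5 per arm) to avoid division by zero
    counts = [[1, 1] for _ in range(pools)]
    sums = [[0.5, 0.5] for _ in range(pools)]
    rewards = [0.0, 0.0]   # cumulative reward per group
    n_per = [0, 0]         # users per group
    for t in range(n_users):
        g = t % 2                    # alternate users: 0=treatment, 1=control
        p = 0 if share_data else g   # which data pool this group uses
        eps = eps_treat if g == 0 else eps_ctrl
        if rng.random() < eps:
            arm = rng.randrange(2)   # explore uniformly
        else:                        # exploit the pool's current estimates
            est = [sums[p][a] / counts[p][a] for a in range(2)]
            arm = 0 if est[0] >= est[1] else 1
        r = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[p][arm] += 1
        sums[p][arm] += r
        rewards[g] += r
        n_per[g] += 1
    return rewards[0] / n_per[0], rewards[1] / n_per[1]

# Difference-in-means estimate of the GTE, with and without data sharing
t_s, c_s = run_experiment(share_data=True)
t_i, c_i = run_experiment(share_data=False)
print("estimated GTE with data sharing:   ", round(t_s - c_s, 4))
print("estimated GTE without data sharing:", round(t_i - c_i, 4))
```

Under data sharing, the low-exploration treatment group free-rides on the exploration done by the control group, so the two groups' learned arm estimates are coupled and the difference-in-means estimate can shrink or even change sign relative to the no-sharing comparison, which is the kind of decision-flipping interference the paper analyzes.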
