[2601.11924] Communication-Corruption Coupling and Verification in Cooperative Multi-Objective Bandits
Summary
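This paper studies cooperative multi-objective bandits under adversarial corruption. Its central result is a communication-corruption coupling: a fixed corruption budget $\Gamma$ can translate into an effective corruption level anywhere from $\Gamma$ to $N\Gamma$, depending on whether the $N$ agents share raw samples, sufficient statistics, or only arm recommendations.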
Why It Matters
In multi-agent systems where an adversary can perturb feedback, how agents communicate affects how much damage a fixed corruption budget can do. This research clarifies how the choice of sharing protocol changes the impact of corruption on cooperative learning, which is relevant wherever learning agents pool information that may be unreliable.
Key Takeaways
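- The study introduces a communication-corruption coupling: the same environment-side budget $\Gamma$ yields an effective corruption level between $\Gamma$ and $N\Gamma$, depending on what agents share.
- The sharing protocol (raw samples, sufficient statistics, or arm recommendations) determines the corruption penalty; raw-sample sharing can suffer an $N$-fold larger additive penalty.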
- Verification of observations is critical in high-corruption regimes to restore learnability and reduce regret.
- The research establishes theoretical limits on performance penalties due to corruption, emphasizing the need for clean information.
- Centralized-rate team regret can be achieved through effective sharing strategies.
arXiv:2601.11924 (cs), Computer Science > Machine Learning
Submitted on 17 Jan 2026 (v1), last revised 20 Feb 2026 (v2)
Author: Ming Shi
Abstract
We study cooperative stochastic multi-armed bandits with vector-valued rewards under adversarial corruption and limited verification. In each of $T$ rounds, each of $N$ agents selects an arm, the environment generates a clean reward vector, and an adversary perturbs the observed feedback subject to a global corruption budget $\Gamma$. Performance is measured by team regret under a coordinate-wise nondecreasing, $L$-Lipschitz scalarization $\phi$, covering linear, Chebyshev, and smooth monotone utilities. Our main contribution is a communication-corruption coupling: we show that a fixed environment-side budget $\Gamma$ can translate into an effective corruption level ranging from $\Gamma$ to $N\Gamma$, depending on whether agents share raw samples, sufficient statistics, or only arm recommendations. We formalize this via a protocol-induced multiplicity functional and prove regret bounds parameterized by the resulting effective corruption. As corollaries, raw-sample sharing can suffer an $N$-fold larger additive corruption penalty, whereas summary sharing and rec...
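To make the scalarization $\phi$ from the abstract concrete, the following is a minimal Python sketch of two of the families it names, linear and Chebyshev. The function names, the nonnegative weight vector `w`, the max-min form of the Chebyshev scalarization, and the use of the sup norm for the Lipschitz constants are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np


def linear_scalarization(r, w):
    """phi(r) = <w, r>. Assumes w >= 0, which makes phi nondecreasing
    in every coordinate. (Hypothetical helper, not from the paper.)"""
    r = np.asarray(r, dtype=float)
    w = np.asarray(w, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be nonnegative")
    return float(w @ r)


def chebyshev_scalarization(r, w):
    """phi(r) = min_d w_d * r_d: the worst weighted objective.
    This max-min form is one common convention and is an assumption here."""
    r = np.asarray(r, dtype=float)
    w = np.asarray(w, dtype=float)
    if np.any(w < 0):
        raise ValueError("weights must be nonnegative")
    return float(np.min(w * r))


def lipschitz_constant(w, kind):
    """Lipschitz constant with respect to the sup norm (assumed norm):
    ||w||_1 for the linear form, max_d w_d for the Chebyshev form."""
    w = np.asarray(w, dtype=float)
    if kind == "linear":
        return float(np.sum(np.abs(w)))
    if kind == "chebyshev":
        return float(np.max(w))
    raise ValueError("kind must be 'linear' or 'chebyshev'")


if __name__ == "__main__":
    w = np.array([0.5, 0.5])
    clean = np.array([0.2, 0.8])          # clean reward vector
    observed = clean + np.array([0.1, 0.0])  # feedback shifted by 0.1 in one coordinate
    for name, phi in [("linear", linear_scalarization),
                      ("chebyshev", chebyshev_scalarization)]:
        shift = abs(phi(observed, w) - phi(clean, w))
        bound = lipschitz_constant(w, name) * np.max(np.abs(observed - clean))
        print(name, "shift:", shift, "Lipschitz bound:", bound)
```

Both functions are coordinate-wise nondecreasing when the weights are nonnegative, and the Lipschitz property caps how far a perturbation of the observed reward vector can move the scalarized value, which is the property the abstract requires of $\phi$.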