[2604.04394] Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games

arXiv:2604.04394 (cs)
[Submitted on 6 Apr 2026]

Title: Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games
Authors: Narim Jeong, Donghwan Lee

Abstract: Reinforcement learning has been successful both empirically and theoretically in single-agent settings, but extending these results to multi-agent reinforcement learning in general-sum Markov games remains challenging. This paper studies the convergence of Stackelberg Q-value iteration in two-player general-sum Markov games from a control-theoretic perspective. We introduce a relaxed policy condition tailored to the Stackelberg setting and model the learning dynamics as a switching system. By constructing upper and lower comparison systems, we establish finite-time error bounds for the Q-functions and characterize their convergence properties. Our results provide a novel control-theoretic perspective on Stackelberg learning. Moreover, to the best of the authors' knowledge, this paper offers the first finite-time convergence guarantees for Q-value iteration in general-sum Markov games under Stackelberg interactions.

Subjects: Machine Learning (cs.LG); Systems and Control (eess.SY)
Cite as: arXiv:2604.04394 [cs.LG] (or arXiv:2604.04394v1 [cs.LG] for this version)
https://doi.org/10.48550...
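For readers who want a concrete picture of the kind of iteration the abstract refers to, here is a minimal sketch of one common tabular form of Stackelberg Q-value iteration. It is not the authors' exact algorithm. It assumes deterministic policies: at each state the follower best-responds to every leader action under its own Q-function, and the leader picks the action that maximizes its own Q-value given that response. The function names, the tie-breaking rule (lowest index wins), and the one-state example game are hypothetical, and the sketch does not implement the paper's relaxed policy condition or its comparison systems. In general-sum games an iteration of this kind may fail to converge without additional conditions.

```python
def stackelberg_policy(q_leader, q_follower, s):
    """Return (a, b): the leader action and follower best response at state s.

    Assumed rule: the follower maximizes its own Q given the leader action,
    and the leader maximizes its own Q while anticipating that response.
    """
    n_a = len(q_leader[s])
    n_b = len(q_leader[s][0])
    # Follower best response to each possible leader action.
    best_b = [max(range(n_b), key=lambda b: q_follower[s][a][b])
              for a in range(n_a)]
    # Leader commits to the action that is best given the follower's response.
    a_star = max(range(n_a), key=lambda a: q_leader[s][a][best_b[a]])
    return a_star, best_b[a_star]


def stackelberg_q_iteration(r_leader, r_follower, P, gamma, n_iters):
    """Synchronous tabular Q-value iteration for both players.

    r_leader[s][a][b], r_follower[s][a][b]: rewards.
    P[s][a][b][t]: probability of moving from state s to state t.
    """
    n_s, n_a, n_b = len(P), len(P[0]), len(P[0][0])
    q_l = [[[0.0] * n_b for _ in range(n_a)] for _ in range(n_s)]
    q_f = [[[0.0] * n_b for _ in range(n_a)] for _ in range(n_s)]
    for _ in range(n_iters):
        # State values under the current Stackelberg policy.
        v_l, v_f = [], []
        for s in range(n_s):
            a, b = stackelberg_policy(q_l, q_f, s)
            v_l.append(q_l[s][a][b])
            v_f.append(q_f[s][a][b])

        def backup(r, v):
            # One Bellman-style backup using the given state values.
            return [[[r[s][a][b]
                      + gamma * sum(P[s][a][b][t] * v[t] for t in range(n_s))
                      for b in range(n_b)]
                     for a in range(n_a)]
                    for s in range(n_s)]

        q_l, q_f = backup(r_leader, v_l), backup(r_follower, v_f)
    return q_l, q_f


# Hypothetical one-state game with two actions per player.
P = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
r_leader = [[[2.0, 4.0], [1.0, 3.0]]]
r_follower = [[[1.0, 0.0], [0.0, 1.0]]]

q_l, q_f = stackelberg_q_iteration(r_leader, r_follower, P, gamma=0.5, n_iters=80)
print(stackelberg_policy(q_l, q_f, 0), q_l[0][1][1], q_f[0][1][1])
```

In this toy game the leader's largest single reward (4) sits at a joint action the follower would never choose, so the leader settles on action 1 and the follower on action 1, which illustrates why the leader's maximization has to go through the follower's best response.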

Originally published on April 07, 2026. Curated by AI News.
