[2602.17737] Nested Training for Mutual Adaptation in Human-AI Teaming
Summary
This paper presents a nested training approach for improving mutual adaptation in human-AI teaming, addressing the problem that agents trained against static partners fail to generalize to partners whose behavior adapts over time.
Why It Matters
As AI systems increasingly collaborate with humans, understanding and improving their adaptability is crucial for effective teamwork. This research provides insights into training methodologies that can enhance AI agents' performance in real-world scenarios where human behaviors are unpredictable.
Key Takeaways
- Introduces a nested training regime to improve human-AI adaptability.
- Models the human-robot teaming scenario as an Interactive Partially Observable Markov Decision Process (I-POMDP), treating human adaptation as part of the state.
- Demonstrates improved task performance and adaptability in AI agents when paired with unseen adaptive partners.
Computer Science > Robotics
arXiv:2602.17737 (cs)
[Submitted on 18 Feb 2026]
Title: Nested Training for Mutual Adaptation in Human-AI Teaming
Authors: Upasana Biswas, Durgesh Kalwar, Subbarao Kambhampati, Sarath Sreedharan
Abstract: Mutual adaptation is a central challenge in human-AI teaming, as humans naturally adjust their strategies in response to a robot's policy. Existing approaches aim to improve diversity in training partners to approximate human behavior, but these partners are static and fail to capture the adaptive behavior of humans. Exposing robots to adaptive behaviors is critical, yet when both agents learn simultaneously in a multi-agent setting, they often converge to opaque implicit coordination strategies that only work with the agents they were co-trained with. Such agents fail to generalize when paired with new partners. In order to capture the adaptive behavior of humans, we model the human-robot teaming scenario as an Interactive Partially Observable Markov Decision Process (I-POMDP), explicitly modeling human adaptation as part of the state. We propose a nested training regime to approximately learn the solution to a finite-level I-POMDP. In this framework, agents at each level are trained against adaptive agents from the level below. This ensures that the ego agent is exposed to adaptive behavior dur...
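The nested training regime described in the abstract can be illustrated with a minimal toy sketch. Everything below is illustrative, not from the paper: the coordination game (reward for matching actions), the bandit-style bias updates, and the class names `StaticPartner` and `AdaptivePartner` are all assumptions standing in for the paper's actual environments and learners. The key structural idea it shows is that level 0 uses static partners, while each subsequent level trains the ego agent against partners that adapt and are seeded from the level below.

```python
import random

class StaticPartner:
    """Level-0 partner: fixed behavior, does not adapt (hypothetical)."""
    def __init__(self, bias):
        self.bias = bias  # probability of choosing action 1

    def act(self):
        return 1 if random.random() < self.bias else 0

    def observe(self, partner_action):
        pass  # static: ignores the ego agent's behavior

class AdaptivePartner:
    """Higher-level partner: shifts its action bias toward the ego
    agent's observed actions (a toy stand-in for human adaptation)."""
    def __init__(self, bias=0.5, lr=0.2):
        self.bias = bias
        self.lr = lr

    def act(self):
        return 1 if random.random() < self.bias else 0

    def observe(self, partner_action):
        # move the bias toward matching the ego agent's last action
        self.bias += self.lr * (partner_action - self.bias)

def train_ego(partners, episodes=500):
    """Train a toy ego policy (a single action bias) against a pool of
    partners; reward is 1 when the two actions match."""
    bias = 0.5
    for _ in range(episodes):
        partner = random.choice(partners)
        ego_a = 1 if random.random() < bias else 0
        partner_a = partner.act()
        partner.observe(ego_a)  # partner may adapt to the ego agent
        if ego_a == partner_a:
            # bandit-style update: reinforce the chosen action on success
            bias += 0.05 * (ego_a - bias)
    return bias

def nested_training(levels=3):
    """Level 0: ego trained against static partners. Level k > 0: ego
    trained against adaptive partners seeded from the level below."""
    partner_pool = [StaticPartner(0.2), StaticPartner(0.8)]
    ego_bias = 0.5
    for _ in range(levels):
        ego_bias = train_ego(partner_pool)
        # the next level's partners adapt, starting from the current ego
        partner_pool = [AdaptivePartner(bias=ego_bias)]
    return ego_bias

random.seed(0)
final_bias = nested_training()
assert 0.0 <= final_bias <= 1.0
```

The point of the structure, as in the paper's framework, is that by level k the ego agent has only ever been trained against partners that themselves respond to its behavior, so it is exposed to adaptation during training rather than only at deployment.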