[2602.20517] Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination

arXiv - Machine Learning

Summary

The paper presents MIMIC (Modeling Inner Motivations for Imitation and Control), a framework that improves human-AI coordination by using inner speech as an internal guide for behavior imitation in artificial agents, yielding more diverse and steerable agent behavior.

Why It Matters

As AI systems increasingly interact with humans, the ability to mimic human-like behaviors and adapt to context is crucial. This research addresses limitations in current imitation learning methods, proposing a novel approach that leverages inner speech for better human-AI collaboration.

Key Takeaways

  • MIMIC uses inner speech as a guide for behavior imitation in AI.
  • The framework improves the diversity and fidelity of AI responses.
  • It allows for nuanced behavioral steering without requiring additional training.
  • Experiments show significant enhancements in robotic tasks and collaboration games.
  • The code and pre-trained models are open-sourced for further research.

Computer Science > Artificial Intelligence
arXiv:2602.20517 (cs) [Submitted on 24 Feb 2026]
Title: Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination
Authors: Rakshit Trivedi, Kartik Sharma, David C Parkes

Abstract: Effective human-AI coordination requires artificial agents capable of exhibiting and responding to human-like behaviors while adapting to changing contexts. Imitation learning has emerged as one of the prominent approaches to building such agents by training them to mimic human-demonstrated behaviors. However, current methods struggle to capture the inherent diversity and non-Markovian nature of human behavior, and they lack the ability to steer behavior at inference time. Drawing inspiration from the theory of human cognitive processes, in which inner speech guides action selection before execution, we propose MIMIC (Modeling Inner Motivations for Imitation and Control), a framework that uses language as an internal representation of behavioral intent. MIMIC employs the novel use of vision-language models as linguistic scaffolding to train a conditional variational autoencoder capable of generating inner speech from observations. A diffusion-based behavior cloning policy then selects actions conditioned on current observations and...
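The pipeline the abstract describes, sampling an "inner speech" representation from observations and conditioning the action policy on it, can be illustrated with a minimal toy sketch. This is not the authors' code: the components, names, and dimensions below are all hypothetical stand-ins (a linear map in place of the CVAE decoder, a linear scorer in place of the diffusion policy), meant only to show how conditioning on a swappable intent embedding enables inference-time steering without retraining.

```python
# Toy sketch of a MIMIC-style inference loop (hypothetical, not the paper's code):
# observation -> sampled "inner speech" embedding -> action conditioned on both.
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM, SPEECH_DIM, ACT_DIM = 8, 16, 4  # hypothetical sizes

# Stand-in for the CVAE decoder: maps observation + latent noise to an
# inner-speech embedding (a stochastic code for behavioral intent).
W_speech = rng.normal(size=(OBS_DIM, SPEECH_DIM))

def generate_inner_speech(obs, z=None):
    """Sample an inner-speech embedding conditioned on the observation."""
    if z is None:
        z = rng.normal(size=SPEECH_DIM)       # latent sample, as in a CVAE
    return np.tanh(obs @ W_speech + 0.1 * z)

# Stand-in for the behavior-cloning policy: scores actions given the
# observation concatenated with the inner-speech conditioning vector.
W_pi = rng.normal(size=(OBS_DIM + SPEECH_DIM, ACT_DIM))

def policy(obs, speech):
    """Pick the highest-scoring action conditioned on obs and inner speech."""
    scores = np.concatenate([obs, speech]) @ W_pi
    return int(np.argmax(scores))

obs = rng.normal(size=OBS_DIM)
speech = generate_inner_speech(obs)
action = policy(obs, speech)

# Inference-time steering: swap the sampled speech for a chosen intent
# embedding; the policy's behavior changes with no additional training.
steered_speech = np.ones(SPEECH_DIM)          # hypothetical target "intent"
steered_action = policy(obs, steered_speech)
```

The key design point the sketch mirrors is that the policy never needs retraining to be steered: only the conditioning vector changes, which is what the paper's "steering without additional training" claim relies on.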


