[2603.20164] The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning
Computer Science > Robotics
arXiv:2603.20164 (cs)
[Submitted on 20 Mar 2026]

Title: The Robot's Inner Critic: Self-Refinement of Social Behaviors through VLM-based Replanning
Authors: Jiyu Lim, Youngwoo Yoon, Kwanghyun Park

Abstract: Conventional robot social behavior generation has been limited in flexibility and autonomy, relying on predefined motions or human feedback. This study proposes CRISP (Critique-and-Replan for Interactive Social Presence), an autonomous framework where a robot critiques and replans its own actions by leveraging a Vision-Language Model (VLM) as a "human-like social critic." CRISP integrates:
(1) extraction of movable joints and constraints by analyzing the robot's description file (e.g., MJCF),
(2) generation of step-by-step behavior plans based on situational context,
(3) generation of low-level joint control code by referencing visual information (joint range-of-motion visualizations),
(4) VLM-based evaluation of social appropriateness and naturalness, including pinpointing erroneous steps, and
(5) iterative refinement of behaviors through reward-based search.
This approach is not tied to a specific robot API; it can generate subtly different, human-like motions on various platforms using only the robot's structure file. In a user study involving five different robot...
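
To make step (1) of the abstract concrete, the following is a minimal sketch of how movable joints and their ranges might be read from an MJCF file using only Python's standard library. It is not the authors' implementation: the function name `extract_movable_joints`, the toy model `TOY_MJCF`, and the returned dictionary layout are all assumptions made for illustration. The sketch also ignores MJCF `<default>` classes and `<include>` files, which a real extractor would have to resolve.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy robot description, written only for this illustration.
TOY_MJCF = """
<mujoco model="toy_arm">
  <worldbody>
    <body name="upper_arm">
      <joint name="shoulder_pitch" type="hinge" axis="0 1 0" range="-1.57 1.57"/>
      <body name="forearm">
        <joint name="elbow" axis="0 1 0" range="0 2.0"/>
        <body name="gripper">
          <joint name="gripper_slide" type="slide" axis="1 0 0"/>
        </body>
      </body>
    </body>
  </worldbody>
  <tendon>
    <fixed name="coupling">
      <joint joint="elbow" coef="1"/>
    </fixed>
  </tendon>
</mujoco>
"""


def extract_movable_joints(mjcf_text):
    """Return one dict per joint declared under <worldbody>.

    Assumption: joint attributes are written inline on each <joint> element.
    Values inherited from <default> classes are NOT resolved here.
    """
    root = ET.fromstring(mjcf_text.strip())
    worldbody = root.find("worldbody")
    if worldbody is None:
        return []

    joints = []
    # Search only inside <worldbody>: <joint> elements elsewhere (for example
    # inside <tendon>) are references to joints, not joint definitions.
    for elem in worldbody.iter("joint"):
        raw_range = elem.get("range")
        limits = None
        if raw_range is not None:
            lo, hi = (float(v) for v in raw_range.split())
            limits = (lo, hi)  # units follow the file's compiler settings
        joints.append({
            "name": elem.get("name"),
            # MJCF treats a joint with no explicit type as a hinge.
            "type": elem.get("type", "hinge"),
            "range": limits,  # None means no range was written inline
        })
    return joints


if __name__ == "__main__":
    for joint in extract_movable_joints(TOY_MJCF):
        print(joint)
```

A list like this could then be handed to the later planning and code-generation stages as the set of joints the model is allowed to command, with the ranges acting as hard constraints on generated targets.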