[2506.04867] Sensory-Motor Control with Large Language Models via Iterative Policy Refinement
Summary
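This paper presents a method for enabling large language models (LLMs) to control embodied agents. The LLM generates a control policy that maps continuous observation vectors to continuous action vectors, then refines that policy iteratively using performance feedback and sensory-motor data collected during evaluation.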
Why It Matters
Integrating language models with sensory-motor control could enhance robotic capabilities and human-computer interaction. This research could lead to autonomous systems that adapt through interaction with their environment, bridging the gap between symbolic reasoning and sub-symbolic sensory-motor data.
Key Takeaways
- LLMs can generate control strategies for embodied agents based on textual descriptions.
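- The proposed method refines strategies iteratively, prompting the LLM with performance feedback and sensory-motor data collected during evaluation.
- Validation on classic control tasks from the Gymnasium library and the inverted pendulum task from the MuJoCo library shows the approach is effective with relatively compact models such as GPT-oss:120b and Qwen2.5:72b.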
- The approach integrates symbolic reasoning with sensory-motor data.
- Potential applications include improved autonomous systems and robotics.
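The generate, evaluate, and refine cycle described in the takeaways above can be sketched roughly as follows. This is a minimal illustration under loud assumptions, not the authors' implementation: the toy 1-D point-mass environment, the single-gain policy, and the names `make_policy`, `evaluate`, `stub_llm_refine`, and `refine` are all hypothetical. In particular, `stub_llm_refine` is a hand-written stand-in for the step where an LLM would be prompted with the task description, the current strategy, its score, and the recorded sensory-motor data.

```python
from typing import Callable, List, Tuple

Policy = Callable[[List[float]], List[float]]
Trace = List[Tuple[List[float], List[float]]]


def make_policy(gain: float) -> Policy:
    """Build a policy mapping a continuous observation vector to a continuous action vector."""
    def policy(obs: List[float]) -> List[float]:
        position, velocity = obs
        return [-gain * position - velocity]
    return policy


def evaluate(policy: Policy, steps: int = 50) -> Tuple[float, Trace]:
    """Run one episode in a toy point-mass environment (hypothetical, not from the paper).

    Returns the total reward and the recorded (observation, action) pairs.
    """
    position, velocity, dt = 1.0, 0.0, 0.1
    total_reward = 0.0
    trace: Trace = []
    for _ in range(steps):
        obs = [position, velocity]
        action = policy(obs)
        trace.append((obs, action))
        velocity += action[0] * dt
        position += velocity * dt
        total_reward -= abs(position)  # reward is higher the closer the mass is to the origin
    return total_reward, trace


def stub_llm_refine(description: str, gain: float, reward: float, trace: Trace) -> float:
    """Stand-in for the LLM call (assumption): nudge the gain if the episode ended off-target."""
    final_position = trace[-1][0][0]
    return gain + 0.5 if abs(final_position) > 0.05 else gain


def refine(description: str, iterations: int = 5):
    """Evaluate the current strategy, feed the results back, and keep the best strategy seen."""
    gain = 0.5  # initial strategy; in the described method this would come from the first LLM response
    best_gain, best_reward = gain, float("-inf")
    history = []
    for _ in range(iterations + 1):
        reward, trace = evaluate(make_policy(gain))
        history.append((gain, reward))
        if reward > best_reward:
            best_gain, best_reward = gain, reward
        gain = stub_llm_refine(description, gain, reward, trace)
    return best_gain, best_reward, history


if __name__ == "__main__":
    best_gain, best_reward, history = refine("Drive the point mass to the origin.")
    for gain, reward in history:
        print(f"gain={gain:.1f} reward={reward:.3f}")
    print(f"best gain: {best_gain:.1f}")
```

Keeping the best-scoring strategy rather than the most recent one is a design choice of this sketch, so that a poor refinement step cannot discard an earlier, better policy.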
arXiv:2506.04867 (cs), Computer Science > Artificial Intelligence
Submitted on 5 Jun 2025 (v1), last revised 24 Feb 2026 (this version, v4)
Title: Sensory-Motor Control with Large Language Models via Iterative Policy Refinement
Authors: Jônata Tyska Carvalho, Stefano Nolfi
Abstract: We propose a method that enables large language models (LLMs) to control embodied agents through the generation of control policies that directly map continuous observation vectors to continuous action vectors. At the outset, the LLMs generate a control strategy based on a textual description of the agent, its environment, and the intended goal. This strategy is then iteratively refined through a learning process in which the LLMs are repeatedly prompted to improve the current strategy, using performance feedback and sensory-motor data collected during its evaluation. The method is validated on classic control tasks from the Gymnasium library and the inverted pendulum task from the MuJoCo library. The approach proves effective with relatively compact models such as GPT-oss:120b and Qwen2.5:72b. In most cases, it successfully identifies optimal or near-optimal solutions by integrating symbolic knowledge derived through reasoning with sub-symbolic sensory-motor data gathered as the agent interacts with its environm...