[2602.22452] CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines
Summary
The paper presents the Contrastive World Model (CWM), which fine-tunes a large language model (LLM) as an action feasibility scorer for embodied agents, using an InfoNCE contrastive objective with hard-mined negative examples.
Why It Matters
Before any planning or reasoning occurs, an embodied agent must identify which candidate actions are physically executable in the current state, and the paper treats a reliable feasibility scorer as a critical bottleneck. Better action scoring could lead to safer and more efficient robotic systems, with potential applications such as autonomous vehicles and service robots.
Key Takeaways
- CWM improves action feasibility scoring by using a contrastive training method.
- Outperforms supervised fine-tuning (SFT) by +6.76 percentage points on Precision@1 for minimal-edit negatives, measured on 605 hard-negative test pairs.
- Demonstrates better safety margins in action ranking under stress conditions.
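The abstract reports the SFT comparison as Precision@1. The sketch below shows one common way to compute that metric for an action scorer, assuming each test example is a list of `(score, is_valid)` candidate pairs. The function name `precision_at_1` and this data layout are illustrative assumptions, not the paper's evaluation code.

```python
def precision_at_1(examples):
    """Fraction of examples whose top-scored candidate is a valid action.

    `examples` is a list of examples; each example is a list of
    (score, is_valid) pairs. This layout is a hypothetical choice made
    for illustration only.
    """
    if not examples:
        raise ValueError("need at least one example")
    hits = 0
    for candidates in examples:
        # max() returns the first maximal item, so ties go to the
        # earliest candidate in the list.
        _, top_is_valid = max(candidates, key=lambda c: c[0])
        if top_is_valid:
            hits += 1
    return hits / len(examples)


# Two toy examples: the scorer ranks the valid action first only in the first.
toy = [
    [(0.9, True), (0.1, False)],
    [(0.2, True), (0.7, False)],
]
p_at_1 = precision_at_1(toy)
```

On hard-negative pairs, each example would hold one valid action and one semantically similar but physically incompatible candidate, so the metric reduces to how often the valid action receives the higher score.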
Computer Science > Artificial Intelligence
arXiv:2602.22452 (cs)
[Submitted on 25 Feb 2026]
Title: CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines
Authors: Chayan Banerjee
Abstract: A reliable action feasibility scorer is a critical bottleneck in embodied agent pipelines: before any planning or reasoning occurs, the agent must identify which candidate actions are physically executable in the current state. Existing approaches use supervised fine-tuning (SFT) to train action scorers, but SFT treats each candidate independently and does not explicitly teach the model to discriminate between actions that are physically correct and those that are subtly wrong. We propose the Contrastive World Model (CWM), which fine-tunes a large language model (LLM) as an action scorer using an InfoNCE contrastive objective with hard-mined negative examples. The key idea is to push valid actions away from invalid ones in scoring space, with special emphasis on hard negatives: semantically similar but physically incompatible candidates. We evaluate CWM on the ScienceWorld benchmark through two studies. First, an intrinsic affordance evaluation on 605 hard-negative test pairs shows that CWM outperforms SFT by +6.76 percentage points on Precision@1 for minimal-edit negatives -- cases where ...
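The InfoNCE objective named in the abstract can be illustrated with a minimal sketch. It assumes the scorer has already produced one scalar score for the valid action and one for each negative candidate in the same state. The function name `info_nce_loss`, the `temperature` parameter, and the way scores are obtained from the LLM are assumptions for illustration; the abstract does not specify the paper's exact formulation.

```python
import math


def info_nce_loss(pos_score, neg_scores, temperature=1.0):
    """Standard InfoNCE loss for one state.

    Computes -log softmax(valid action) over the valid action and its
    negatives. All names and the temperature default are hypothetical.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    # Loss is small when the valid action's logit dominates the negatives.
    return log_sum_exp - logits[0]


# An easy negative scores far below the valid action; a hard negative
# (semantically similar but infeasible) scores almost as high.
easy = info_nce_loss(2.0, [-2.0])
hard = info_nce_loss(2.0, [1.9])
```

In this sketch the hard negative produces the larger loss, which is why hard-mined negatives carry more training signal: minimizing the loss forces the scorer to widen the gap between the valid action and its closest invalid competitors.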