[2602.22452] CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines

arXiv - AI 4 min read Article

Summary

The paper presents Contrastive World Models (CWM) for enhancing action feasibility learning in embodied agents, improving action scoring through a contrastive training approach.

Why It Matters

This research addresses a critical challenge in robotics and AI by improving how agents assess the feasibility of actions in real-time environments. By enhancing action scoring mechanisms, it can lead to safer and more efficient robotic systems, which is vital for applications in various fields, including autonomous vehicles and service robots.

Key Takeaways

  • CWM improves action feasibility scoring by using a contrastive training method.
  • Outperforms traditional supervised fine-tuning (SFT) by +6.76 percentage points on Precision@1 for minimal-edit hard negatives.
  • Demonstrates better safety margins in action ranking under stress conditions.

Computer Science > Artificial Intelligence · arXiv:2602.22452 (cs) · Submitted on 25 Feb 2026
Title: CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines
Author: Chayan Banerjee

Abstract: A reliable action feasibility scorer is a critical bottleneck in embodied agent pipelines: before any planning or reasoning occurs, the agent must identify which candidate actions are physically executable in the current state. Existing approaches use supervised fine-tuning (SFT) to train action scorers, but SFT treats each candidate independently and does not explicitly teach the model to discriminate between actions that are physically correct and those that are subtly wrong. We propose the Contrastive World Model (CWM), which fine-tunes a large language model (LLM) as an action scorer using an InfoNCE contrastive objective with hard-mined negative examples. The key idea is to push valid actions away from invalid ones in scoring space, with special emphasis on hard negatives: semantically similar but physically incompatible candidates. We evaluate CWM on the ScienceWorld benchmark through two studies. First, an intrinsic affordance evaluation on 605 hard-negative test pairs shows that CWM outperforms SFT by +6.76 percentage points on Precision@1 for minimal-edit negatives -- cases where ...
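The InfoNCE objective described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the scalar-score framing, and the temperature value are assumptions. The idea is that the valid action's score is the positive logit, and the scores of hard-mined invalid candidates (semantically similar but physically incompatible actions) are the negative logits in a softmax cross-entropy.

```python
import math

def infonce_loss(pos_score, neg_scores, temperature=0.07):
    """Hypothetical InfoNCE loss over action-feasibility scores.

    pos_score:  scalar score the model assigns to the valid action.
    neg_scores: scores for invalid candidates, including hard negatives
                (semantically close but physically infeasible actions).
    Returns -log softmax probability of the positive.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_sum)
```

Minimizing this loss pushes the valid action's score above the invalid ones; emphasizing hard negatives simply means they populate `neg_scores`, so the model must separate the positive from its most confusable alternatives rather than from obviously wrong actions.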

Related Articles

Machine Learning

AI chip startup Rebellions raises $400 million at $2.3B valuation in pre-IPO round | TechCrunch

The startup, which is planning to go public later this year, designs chips specifically for AI inference, another challenger to Nvidia's ...

TechCrunch - AI · 4 min ·
Llms

CLI for Google AI Search (gai.google) — run AI-powered code/tech searches headlessly from your terminal

Google AI (gai.google) gives Gemini-powered answers for technical queries — think AI-enhanced search with code understanding. I built a C...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Big increase in the number of people using AI to write their replies

I find it interesting that we’ve all randomly decided to use the “-“ more often recently on reddit, and everyone’s grammar has drasticall...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

[D] MXFP8 GEMM: Up to 99% of cuBLAS performance using CUDA + PTX

New blog post by Daniel Vega-Myhre (Meta/PyTorch) illustrating GEMM design for FP8, including deep-dives into all the constraints and des...

Reddit - Machine Learning · 1 min ·
