Robotics & Embodied AI

Physical AI, robots, and autonomous systems

Top This Week

Robotics

What happens when AI agents can earn and spend real money? I built a small test to find out

I've been sitting with a question for a while: what happens when AI agents aren't just tools to be used, but participants in an economy? ...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Robotics

AIPass Herald

Some insight into building a multi-agent autonomous system. This is like the daily newspaper for the project. A quick read to see how ou...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

Machine Learning

[2603.13846] Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video

Abstract page for arXiv paper 2603.13846: Is Seeing Believing? Evaluating Human Sensitivity to Synthetic Video

arXiv - AI · 3 min · about 7 hours ago

All Content

Robotics

The AI security nightmare is here and it looks suspiciously like lobster | The Verge

A hacker exploited a vulnerability in Cline's AI workflow, leading to the installation of OpenClaw, highlighting significant security ris...

The Verge - AI · 4 min · about 1 month ago

Robotics

AI-powered kung fu robots are an extravagant reminder of where China is ahead of the US in the AI race

The article discusses China's advancements in AI, particularly through the lens of AI-powered kung fu robots, highlighting the technologi...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

AI Startups

Freeform raises $67M Series B to scale up laser AI manufacturing | TechCrunch

Freeform has raised $67 million in Series B funding to enhance its AI-driven metal 3D printing technology, aiming to scale production and...

TechCrunch - AI · 5 min · about 1 month ago

Robotics

The Download: Autonomous narco submarines, and virtue signaling chatbots | MIT Technology Review

This edition of The Download covers advancements in autonomous narco submarines, ethical concerns surrounding AI chatbots, and the evolvi...

MIT Technology Review · 7 min · about 1 month ago

Robotics

[2507.08831] View Invariant Learning for Vision-Language Navigation in Continuous Environments

This paper introduces View Invariant Learning (VIL) for enhancing Vision-Language Navigation in Continuous Environments (VLNCE), addressi...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.12281] Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment

This paper explores the effectiveness of test-time verification over policy learning in enhancing Vision-Language-Action (VLA) alignment,...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.07680] Vision and Language: Novel Representations and Artificial intelligence for Driving Scene Safety Assessment and Autonomous Vehicle Planning

This paper explores the integration of vision-language models in autonomous driving, focusing on safety assessment and decision-making th...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2601.05378] Inverting Non-Injective Functions with Twin Neural Network Regression

This article presents a novel approach to inverting non-injective functions using Twin Neural Network Regression, focusing on locally inv...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2506.08822] FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

The paper presents FreqPolicy, a novel flow-based visuomotor policy that enhances efficiency in robotic manipulation by imposing frequenc...

arXiv - AI · 4 min · about 1 month ago

Robotics

[2504.08603] FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

The paper presents FindAnything, a framework for open-vocabulary and object-centric mapping that enhances robot exploration in unknown en...

arXiv - AI · 4 min · about 1 month ago

NLP

[2412.10999] Cocoa: Co-Planning and Co-Execution with AI Agents

The paper presents Cocoa, a system designed to enhance human-agent collaboration in AI tasks by allowing flexible co-planning and co-exec...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2411.16537] RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics

The paper presents RoboSpatial, a dataset aimed at enhancing spatial understanding in robotics by providing 2D and 3D vision-language mod...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2505.24157] Experience-based Knowledge Correction for Robust Planning in Minecraft

The paper presents XENON, an advanced agent for robust planning in Minecraft that utilizes experience-based knowledge correction to impro...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2505.12707] PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI

PLAICraft introduces a large-scale dataset capturing time-aligned vision, speech, and action data from multiplayer Minecraft, aimed at ad...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2503.10265] SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Robotic Surgical Video Analysis

The article presents SurgRAW, a multi-agent workflow utilizing Chain of Thought reasoning for enhanced robotic surgical video analysis, a...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.16590] A Contrastive Learning Framework Empowered by Attention-based Feature Adaptation for Street-View Image Classification

This paper presents CLIP-MHAdapter, a novel contrastive learning framework that enhances street-view image classification by using attent...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.16444] RoboGene: Boosting VLA Pre-training via Diversity-Driven Agentic Framework for Real-World Task Generation

RoboGene introduces a framework for automating the generation of diverse, physically plausible robotic manipulation tasks, addressing the...

arXiv - AI · 4 min · about 1 month ago

Robotics

[2602.16356] Articulated 3D Scene Graphs for Open-World Mobile Manipulation

This paper presents MoMa-SG, a framework for creating semantic-kinematic 3D scene graphs to enhance mobile manipulation of articulated ob...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.16187] SIT-LMPC: Safe Information-Theoretic Learning Model Predictive Control for Iterative Tasks

The paper presents SIT-LMPC, a novel algorithm for safe information-theoretic learning model predictive control tailored for robots perfo...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.15922] World Action Models are Zero-shot Policies

The paper introduces DreamZero, a World Action Model (WAM) that enhances zero-shot policy learning for robotic tasks by predicting future...

arXiv - Machine Learning · 4 min · about 1 month ago

