Robotics & Embodied AI

Physical AI, robots, and autonomous systems

Top This Week

[2601.07855] RoAD Benchmark: How LiDAR Models Fail under Coupled Domain Shifts and Label Evolution
Machine Learning

arXiv - AI · 3 min ·
[2502.00262] INSIGHT: Enhancing Autonomous Driving Safety through Vision-Language Models on Context-Aware Hazard Detection and Edge Case Evaluation
LLMs

arXiv - AI · 4 min ·
[2508.00500] ProbGuard: Probabilistic Runtime Monitoring for LLM Agent Safety
LLMs

arXiv - AI · 4 min ·

All Content

[2506.21427] Flow-Based Single-Step Completion for Efficient and Expressive Policy Learning
Machine Learning

The paper presents the Single-Step Completion Policy (SSCP), a novel approach in reinforcement learning that enhances efficiency and expr...

arXiv - Machine Learning · 4 min ·
[2602.22056] FlowCorrect: Efficient Interactive Correction of Generative Flow Policies for Robotic Manipulation
Machine Learning

The paper presents FlowCorrect, a framework for correcting generative flow policies in robotic manipulation using minimal human input, im...

arXiv - Machine Learning · 3 min ·
[2602.21873] GFPL: Generative Federated Prototype Learning for Resource-Constrained and Data-Imbalanced Vision Task
Machine Learning

The GFPL framework enhances federated learning by addressing data imbalance and communication overhead in resource-constrained vision tas...

arXiv - Machine Learning · 4 min ·
[2602.21783] Therapist-Robot-Patient Physical Interaction is Worth a Thousand Words: Enabling Intuitive Therapist Guidance via Remote Haptic Control
Machine Learning

This paper presents a haptic teleoperation system that enables therapists to remotely guide patients using an arm exoskeleton, enhancing ...

arXiv - Machine Learning · 4 min ·
[2602.21684] Primary-Fine Decoupling for Action Generation in Robotic Imitation
Machine Learning

The paper presents a two-stage framework, Primary-Fine Decoupling for Action Generation (PF-DAG), aimed at improving action generation in...

arXiv - Machine Learning · 4 min ·
[2602.21965] Compact Circulant Layers with Spectral Priors
Machine Learning

This paper explores compact circulant layers with spectral priors, focusing on their application in memory-efficient neural networks for ...

arXiv - Machine Learning · 3 min ·
[2602.21320] Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data
LLMs

The paper presents Tool-R0, a framework for training self-evolving LLM agents capable of tool-learning without prior data, showcasing sig...

arXiv - Machine Learning · 4 min ·
[2602.21319] Uncertainty-Aware Diffusion Model for Multimodal Highway Trajectory Prediction via DDIM Sampling
Machine Learning

The paper presents cVMDx, an advanced diffusion model for multimodal highway trajectory prediction, enhancing efficiency and accuracy in ...

arXiv - Machine Learning · 3 min ·
[2512.08639] Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
NLP

This article presents a unified framework for Aerial Vision-Language Navigation (VLN), enabling UAVs to interpret natural language and na...

arXiv - AI · 4 min ·
[2511.00062] World Simulation with Video Foundation Models for Physical AI
LLMs

The paper presents Cosmos-Predict2.5, an advanced model for world simulation in Physical AI, integrating various generation methods and i...

arXiv - Machine Learning · 5 min ·
[2510.18060] SPACeR: Self-Play Anchoring with Centralized Reference Models
Machine Learning

The paper introduces SPACeR, a framework for enhancing autonomous vehicle behavior through self-play reinforcement learning anchored by a...

arXiv - Machine Learning · 4 min ·
[2510.10472] FML-bench: Benchmarking Machine Learning Agents for Scientific Research
LLMs

The paper introduces FML-bench, a new benchmark for evaluating machine learning agents in scientific research, focusing on exploration di...

arXiv - AI · 4 min ·
[2510.00024] EpidemIQs: Prompt-to-Paper LLM Agents for Epidemic Modeling and Analysis
LLMs

The paper presents EpidemIQs, a multi-agent framework utilizing large language models for efficient epidemic modeling, demonstrating impr...

arXiv - AI · 4 min ·
[2508.21112] EO-1: An Open Unified Embodied Foundation Model for General Robot Control
LLMs

The EO-1 model is introduced as a unified foundation for general robot control, enhancing multimodal reasoning through a large dataset an...

arXiv - AI · 4 min ·
[2602.22197] Off-The-Shelf Image-to-Image Models Are All You Need To Defeat Image Protection Schemes
Machine Learning

This paper demonstrates that off-the-shelf image-to-image models can effectively defeat various image protection schemes, highlighting a ...

arXiv - AI · 4 min ·
[2602.22190] GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
Machine Learning

The paper presents GUI-Libra, a novel training approach for native GUI agents that enhances reasoning and action capabilities through act...

arXiv - Machine Learning · 4 min ·
[2602.21706] SurGo-R1: Benchmarking and Modeling Contextual Reasoning for Operative Zone in Surgical Video
Machine Learning

The paper presents SurGo-R1, a model designed to enhance contextual reasoning in surgical video analysis, addressing challenges in identi...

arXiv - AI · 4 min ·
[2602.21670] Hierarchical LLM-Based Multi-Agent Framework with Prompt Optimization for Multi-Robot Task Planning
LLMs

This article presents a novel hierarchical framework for multi-robot task planning using large language models (LLMs) with prompt optimiz...

arXiv - AI · 4 min ·
[2602.21633] Self-Correcting VLA: Online Action Refinement via Sparse World Imagination
Machine Learning

The paper presents Self-Correcting VLA, a novel approach in robotics that enhances vision-language-action models by integrating sparse wo...

arXiv - AI · 4 min ·
[2602.21611] Structurally Aligned Subtask-Level Memory for Software Engineering Agents
LLMs

The paper presents Structurally Aligned Subtask-Level Memory, a novel approach for enhancing software engineering agents by improving mem...

arXiv - AI · 3 min ·