
[2602.18813] Habilis-$β$: A Fast-Motion and Long-Lasting On-Device Vision-Language-Action Model

arXiv - Machine Learning February 24, 2026 4 min read Article

Summary

Habilis-$β$ is an on-device vision-language-action (VLA) model built for fast motion and long, continuous operation in real-world deployment. It is evaluated under a continuous-run protocol rather than by single-trial success rates alone.

Why It Matters

Current VLA evaluation is largely confined to single-trial success rates under curated resets, which fails to capture how fast a robot works or how long it keeps working without help. By measuring continuous performance instead, the paper targets the qualities that matter for practical deployment: throughput and sustained reliability.

Key Takeaways

Habilis-$β$ uses language-free pre-training on large-scale play data to learn robust interaction priors.
The Productivity-Reliability Plane (PRP) evaluates VLA models by Tasks per Hour (TPH) and Mean Time Between Intervention (MTBI).
In continuous-run tests, Habilis-$β$ significantly outperforms its predecessor, $\beta_{0.5}$, in both simulation and real-world environments.
The model achieves the highest performance on the RoboTwin 2.0 leaderboard, validating its effectiveness.
Innovative techniques like ESPADA and rectified-flow distillation enhance motion control on edge devices.

Computer Science > Robotics
arXiv:2602.18813 (cs)
[Submitted on 21 Feb 2026]

Title: Habilis-$β$: A Fast-Motion and Long-Lasting On-Device Vision-Language-Action Model

Authors: Tommoro Robotics: Jesoon Kang, Taegeon Park, Jisu An, Soo Min Kimm, Jaejoon Kim, Jinu Pahk, Byungju Kim, Junseok Lee, Namheon Baek, Sungwan Ha, Hojun Baek, Eduardo Ayerve Cruz, Wontae Kim, Junghyeon Choi, Yousuk Lee, Joonmo Han, Sunghyun Cho, Sunghyun Kwon, Soyoung Lee, Jun Ki Lee, Seung-Joon Yi, Byoung-Tak Zhang, Theo Taeyeong Kim

Abstract: We introduce Habilis-$\beta$, a fast-motion and long-lasting on-device vision-language-action (VLA) model designed for real-world deployment. Current VLA evaluation remains largely confined to single-trial success rates under curated resets, which fails to capture the fast-motion and long-lasting capabilities essential for practical operation. To address this, we introduce the Productivity-Reliability Plane (PRP), which evaluates performance through Tasks per Hour (TPH) and Mean Time Between Intervention (MTBI) under a continuous-run protocol that demands both high-speed execution and sustained robustness. Habilis-$\beta$ achieves high performance by integrating language-free pre-training on large-scale play data for robust interaction priors with post-training on cyc...
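To make the two PRP axes concrete, here is a minimal Python sketch of how TPH and MTBI might be computed from a summary of one continuous run. The abstract names the metrics but does not give formulas, so the definitions below (completed tasks per hour of run time, and run time divided by the number of interventions) are assumptions, and the `RunLog` type and its field names are hypothetical rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RunLog:
    """Hypothetical summary of one continuous run (not the paper's data format)."""
    duration_s: float      # wall-clock length of the run, in seconds
    tasks_completed: int   # tasks finished successfully during the run
    interventions: int     # number of times a human had to step in


def tasks_per_hour(log: RunLog) -> float:
    """Assumed TPH: completed tasks divided by run length in hours."""
    if log.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return log.tasks_completed / (log.duration_s / 3600.0)


def mean_time_between_intervention(log: RunLog) -> float:
    """Assumed MTBI in seconds: run length divided by intervention count.

    A run with no interventions has no finite estimate under this
    definition, so this sketch returns infinity in that case.
    """
    if log.duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if log.interventions == 0:
        return float("inf")
    return log.duration_s / log.interventions


def prp_point(log: RunLog) -> Tuple[float, float]:
    """One point on the plane: (productivity as TPH, reliability as MTBI)."""
    return tasks_per_hour(log), mean_time_between_intervention(log)
```

As an illustration with made-up numbers, `prp_point(RunLog(duration_s=3600.0, tasks_completed=120, interventions=4))` gives a TPH of 120 and an MTBI of 900 seconds. Plotting one such point per model shows why the two axes are reported together: a fast policy that needs frequent rescue scores high on TPH but low on MTBI.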
