[2602.13444] FlowHOI: Flow-based Semantics-Grounded Generation of Hand-Object Interactions for Dexterous Robot Manipulation
Summary
FlowHOI is a two-stage flow-matching framework that generates semantically grounded, temporally coherent hand-object interaction sequences for dexterous robot manipulation, improving both the fidelity of generated motions and the speed of inference.
Why It Matters
This research addresses a critical gap in robotics: vision-language-action models often fail in long-horizon, contact-rich tasks because the underlying hand-object interaction structure is not explicitly represented. By generating an embodiment-agnostic interaction representation that captures this structure, FlowHOI makes manipulation behaviors easier to validate and transfer across robots.
Key Takeaways
- FlowHOI introduces a two-stage flow-matching framework for generating hand-object interactions.
- The framework enhances action recognition accuracy and physics simulation success rates.
- It achieves significant inference speedups compared to existing methods.
- Real-robot execution demonstrates the practical applicability of the generated interactions.
- The research addresses the scarcity of high-fidelity supervision for hand-object interactions.
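The two-stage flow-matching generation referenced above can be illustrated with a minimal conditional flow-matching training loop. This is a generic NumPy sketch, not the authors' implementation: the toy pose data, the linear velocity model, and all names are hypothetical stand-ins for the paper's deep networks and HOI representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for flattened HOI pose vectors (hypothetical dimension).
DIM = 8
data = rng.normal(loc=2.0, scale=0.5, size=(256, DIM))

# A linear velocity model v_theta([x_t, t]) -> velocity; real systems use deep nets.
W = np.zeros((DIM + 1, DIM))

def predict(W, xt, t):
    feats = np.concatenate([xt, t], axis=1)   # (B, DIM+1)
    return feats @ W

def fm_loss_and_grad(W, x1):
    """Conditional flow matching: regress v_theta(x_t, t) onto (x1 - x0)."""
    B = x1.shape[0]
    x0 = rng.normal(size=x1.shape)            # noise endpoint of the path
    t = rng.uniform(size=(B, 1))              # time in [0, 1]
    xt = (1 - t) * x0 + t * x1                # linear interpolation path
    target = x1 - x0                          # constant velocity along that path
    err = predict(W, xt, t) - target
    loss = float(np.mean(err ** 2))
    feats = np.concatenate([xt, t], axis=1)
    grad = 2.0 * feats.T @ err / (B * DIM)    # gradient of the mean-squared error
    return loss, grad

# A few gradient steps: the regression loss should decrease.
losses = []
for _ in range(200):
    loss, grad = fm_loss_and_grad(W, data)
    W -= 0.05 * grad
    losses.append(loss)
```

At inference time, a sample is drawn by integrating the learned velocity field from noise at t=0 to t=1 (e.g., with a few Euler steps), which is where flow matching gets its speed advantage over many-step diffusion samplers.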
Computer Science > Robotics
arXiv:2602.13444 (cs)
[Submitted on 13 Feb 2026]
Title: FlowHOI: Flow-based Semantics-Grounded Generation of Hand-Object Interactions for Dexterous Robot Manipulation
Authors: Huajian Zeng, Lingyun Chen, Jiaqi Yang, Yuantai Zhang, Fan Shi, Peidong Liu, Xingxing Zuo
Abstract: Recent vision-language-action (VLA) models can generate plausible end-effector motions, yet they often fail in long-horizon, contact-rich tasks because the underlying hand-object interaction (HOI) structure is not explicitly represented. An embodiment-agnostic interaction representation that captures this structure would make manipulation behaviors easier to validate and transfer across robots. We propose FlowHOI, a two-stage flow-matching framework that generates semantically grounded, temporally coherent HOI sequences, comprising hand poses, object poses, and hand-object contact states, conditioned on an egocentric observation, a language instruction, and a 3D Gaussian splatting (3DGS) scene reconstruction. We decouple geometry-centric grasping from semantics-centric manipulation, conditioning the latter on compact 3D scene tokens and employing a motion-text alignment loss to semantically ground th...
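The motion-text alignment loss mentioned in the abstract is not spelled out on this page. A common choice for such alignment is a symmetric InfoNCE (CLIP-style) contrastive loss over paired motion and text embeddings; the sketch below illustrates that generic form with hypothetical embeddings and should not be read as the paper's exact formulation.

```python
import numpy as np

def _logsumexp(x, axis):
    # Numerically stable log-sum-exp along the given axis.
    m = np.max(x, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(x - m), axis=axis, keepdims=True))

def motion_text_alignment_loss(motion_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE: row i of each matrix is a matched motion/text pair."""
    m = motion_emb / np.linalg.norm(motion_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = m @ t.T / temperature                  # scaled cosine similarities
    diag = np.arange(logits.shape[0])
    # Cross-entropy with the matched pair on the diagonal, in both directions.
    loss_mt = -(logits[diag, diag] - _logsumexp(logits, axis=1)[:, 0]).mean()
    loss_tm = -(logits[diag, diag] - _logsumexp(logits.T, axis=1)[:, 0]).mean()
    return 0.5 * (loss_mt + loss_tm)

# Matched pairs should score lower (better) than mismatched pairs.
rng = np.random.default_rng(1)
emb = rng.normal(size=(16, 32))
aligned = motion_text_alignment_loss(emb, emb)
mismatched = motion_text_alignment_loss(emb, rng.normal(size=(16, 32)))
```

Pulling each motion embedding toward its paired instruction embedding, and away from the other instructions in the batch, is one standard way to "semantically ground" generated motion in language.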