
[2602.19565] DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces

arXiv - AI, February 24, 2026

Summary

DICArt is a framework for category-level articulated object pose estimation. Instead of regressing poses in a continuous space, it formulates pose estimation as a conditional discrete diffusion process and recovers the pose by progressively denoising a noisy pose representation.

Why It Matters

Articulated object pose estimation is a core task in embodied AI. According to the abstract, existing methods that regress poses in a continuous space struggle to navigate a large, complex search space and fail to incorporate intrinsic kinematic constraints. DICArt targets both problems with a discrete diffusion formulation and a hierarchical kinematic coupling strategy.

Key Takeaways

DICArt formulates pose estimation as a conditional discrete diffusion process.
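A flexible flow decider determines whether each token is denoised or reset, balancing the real and noise distributions during diffusion.
A hierarchical kinematic coupling strategy estimates the pose of each rigid part hierarchically to respect the object's structure.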
Experimental results show superior performance on synthetic and real-world datasets.
DICArt offers a new paradigm for reliable 6D pose estimation.
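
To make the idea of hierarchical kinematic coupling concrete, here is a minimal sketch of composing part poses from the root outward. It is not taken from the paper: the helper names (`rot_z`, `compose`, `part_poses`), the parent-index layout, and the two-part example are all assumptions chosen for illustration, and the paper's actual strategy may differ.

```python
import math

def rot_z(theta):
    """3x3 rotation about the z-axis (illustrative joint axis, an assumption)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def compose(parent, local):
    """Express a child's local (R, t) pose in the world frame of its parent."""
    Rp, tp = parent
    Rl, tl = local
    R = matmul(Rp, Rl)
    t = [a + b for a, b in zip(matvec(Rp, tl), tp)]
    return R, t

def part_poses(root_pose, parents, local_poses):
    """Resolve world poses part by part, parents before children.

    parents[i] is the index of part i's parent, or -1 for the root part.
    local_poses[i] is part i's pose relative to its parent (unused for the root).
    """
    world = []
    for i, p in enumerate(parents):
        world.append(root_pose if p < 0 else compose(world[p], local_poses[i]))
    return world

# Hypothetical two-part object: a base and one child part attached to it.
root = (rot_z(math.pi / 2), [1.0, 0.0, 0.0])
local = [None, (rot_z(math.pi / 2), [1.0, 0.0, 0.0])]
poses = part_poses(root, [-1, 0], local)
```

Because each child pose is derived from its parent's pose, an error in a joint parameter cannot move a part independently of the structure it is attached to, which is the intuition behind respecting the kinematic hierarchy.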

Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.19565 (cs)
[Submitted on 23 Feb 2026]

Title: DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces

Authors: Li Zhang, Mingyu Mei, Ailing Wang, Xianhui Meng, Yan Zhong, Xinyuan Song, Liu Liu, Rujing Wang, Zaixing He, Cewu Lu

Abstract: Articulated object pose estimation is a core task in embodied AI. Existing methods typically regress poses in a continuous space, but often struggle with 1) navigating a large, complex search space and 2) failing to incorporate intrinsic kinematic constraints. In this work, we introduce DICArt (DIsCrete Diffusion for Articulation Pose Estimation), a novel framework that formulates pose estimation as a conditional discrete diffusion process. Instead of operating in a continuous domain, DICArt progressively denoises a noisy pose representation through a learned reverse diffusion procedure to recover the GT pose. To improve modeling fidelity, we propose a flexible flow decider that dynamically determines whether each token should be denoised or reset, effectively balancing the real and noise distributions during diffusion. Additionally, we incorporate a hierarchical kinematic coupling strategy, estimating the pose of each rigid part hierarchically to respect t...
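
The denoise-or-reset decision described in the abstract can be illustrated with a toy reverse loop over discrete tokens. This is only a sketch under stated assumptions, not the paper's algorithm: the confidence threshold, its linear schedule, uniform resampling as the "reset", and the names `reverse_step`, `sample`, and `toy_predict` are all hypothetical stand-ins, and a real system would use a learned network conditioned on the observation.

```python
import random

def reverse_step(tokens, predict, num_bins, threshold, rng):
    """One reverse step: keep the prediction (denoise) or resample (reset)."""
    out = []
    for i in range(len(tokens)):
        pred, conf = predict(tokens, i)
        if conf >= threshold:
            out.append(pred)                     # denoise: accept the predicted token
        else:
            out.append(rng.randrange(num_bins))  # reset: draw a fresh noise token
    return out

def sample(predict, length, num_bins, steps, seed=0):
    """Start from uniform noise tokens and apply `steps` reverse steps."""
    rng = random.Random(seed)
    tokens = [rng.randrange(num_bins) for _ in range(length)]
    for step in range(steps):
        # Assumed schedule: the threshold falls linearly to 0 at the last step,
        # so early steps reset more often and the final step accepts everything.
        threshold = 1.0 - (step + 1) / steps
        tokens = reverse_step(tokens, predict, num_bins, threshold, rng)
    return tokens

# Hypothetical stand-in for a learned denoiser: always predicts a fixed
# target token with confidence 0.5.
TARGET = [3, 7, 1, 4]

def toy_predict(tokens, i):
    return TARGET[i], 0.5

result = sample(toy_predict, len(TARGET), num_bins=8, steps=5)
```

With this toy predictor, tokens are reset while the threshold is above 0.5 and accepted afterwards, so `result` ends up equal to `TARGET`.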

