[2602.14771] GOT-JEPA: Generic Object Tracking with Model Adaptation and Occlusion Handling using Joint-Embedding Predictive Architecture

arXiv - AI, February 17, 2026

Summary

GOT-JEPA is a model-predictive pretraining framework for generic object tracking. It extends JEPA from predicting image features to predicting tracking models, with the goal of improving generalization to unseen scenarios and making occlusion reasoning more fine-grained.

Why It Matters

This research targets two limitations of recent generic object trackers: they are often optimized for their training targets, which limits robustness and generalization in unseen scenarios, and their occlusion reasoning remains coarse. Better generalization and occlusion perception could benefit computer vision applications such as surveillance and autonomous systems.

Key Takeaways

GOT-JEPA enhances object tracking by integrating model adaptation and occlusion handling.
A teacher predictor generates pseudo-tracking models from a clean current frame, and a student predictor learns to predict the same models from a corrupted version of that frame.
OccuSolver improves occlusion perception and visibility estimation for better tracking performance.
Extensive evaluations demonstrate improved generalization across multiple benchmarks.
The approach is relevant for applications requiring robust tracking in dynamic environments.

arXiv:2602.14771 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 16 Feb 2026
Title: GOT-JEPA: Generic Object Tracking with Model Adaptation and Occlusion Handling using Joint-Embedding Predictive Architecture
Authors: Shih-Fang Chen, Jun-Cheng Chen, I-Hong Jhuo, Yen-Yu Lin

Abstract: The human visual system tracks objects by integrating current observations with previously observed information, adapting to target and scene changes, and reasoning about occlusion at fine granularity. In contrast, recent generic object trackers are often optimized for training targets, which limits robustness and generalization in unseen scenarios, and their occlusion reasoning remains coarse, lacking detailed modeling of occlusion patterns. To address these limitations in generalization and occlusion perception, we propose GOT-JEPA, a model-predictive pretraining framework that extends JEPA from predicting image features to predicting tracking models. Given identical historical information, a teacher predictor generates pseudo-tracking models from a clean current frame, and a student predictor learns to predict the same pseudo-tracking models from a corrupted version of the current frame. This design provides stable pseudo supervision and explic...
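The teacher-student setup in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes linear predictors, random feature masking as the corruption, a mean-squared-error loss, and a frozen teacher, none of which the excerpt confirms. Every name here (predict_tracking_model, corrupt, student_step, run_demo) is hypothetical, and the feature values are made up for illustration.

```python
import random

def predict_tracking_model(weights, history, frame):
    """Hypothetical linear predictor: (history, frame) features -> tracking-model vector."""
    features = history + frame  # list concatenation
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def corrupt(frame, mask_ratio, rng):
    """Zero out a random subset of frame features (an assumed stand-in for the paper's corruption)."""
    n_mask = int(len(frame) * mask_ratio)
    masked = set(rng.sample(range(len(frame)), n_mask))
    return [0.0 if i in masked else x for i, x in enumerate(frame)]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def student_step(student_w, teacher_w, history, frame, rng, lr=0.2, mask_ratio=0.5):
    """One gradient step on the student. The teacher output is a fixed target (assumption)."""
    # Teacher sees the clean current frame and the shared history.
    target = predict_tracking_model(teacher_w, history, frame)
    # Student sees the same history but a corrupted current frame.
    corrupted = corrupt(frame, mask_ratio, rng)
    features = history + corrupted
    pred = predict_tracking_model(student_w, history, corrupted)
    loss = mse(pred, target)
    # Gradient of the MSE for a linear map: (2 / n) * (pred_i - target_i) * x_j
    n = len(pred)
    for i, row in enumerate(student_w):
        g = 2.0 * (pred[i] - target[i]) / n
        for j in range(len(row)):
            row[j] -= lr * g * features[j]
    return loss

def run_demo(steps=2000, seed=0):
    rng = random.Random(seed)
    history = [1.0, -0.5, 0.25, 0.8]   # made-up features of past frames
    frame = [0.6, -1.0, 0.3, 0.9]      # made-up features of the current frame
    teacher_w = [
        [0.5, -0.2, 0.1, 0.3, 0.4, -0.1, 0.2, 0.6],
        [-0.3, 0.4, 0.2, -0.1, 0.5, 0.2, -0.4, 0.1],
    ]
    student_w = [[0.0] * 8 for _ in range(2)]  # student starts from zero weights
    return [student_step(student_w, teacher_w, history, frame, rng) for _ in range(steps)]

losses = run_demo()
print("first loss:", losses[0])
print("last loss:", losses[-1])
```

In this toy setting the student can match the teacher's output despite the masking because the history features are never corrupted. Whether the actual method relies on the same mechanism is not stated in the excerpt.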

