[2602.03604] A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.03604 (cs)
[Submitted on 3 Feb 2026 (v1), last revised 8 Apr 2026 (this version, v3)]

Title: A Lightweight Library for Energy-Based Joint-Embedding Predictive Architectures
Authors: Basile Terver, Randall Balestriero, Megi Dervishi, David Fan, Quentin Garrido, Tushar Nagarajan, Koustuv Sinha, Wancong Zhang, Mike Rabbat, Yann LeCun, Amir Bar

Abstract: We present EB-JEPA, an open-source library for learning representations and world models using Joint-Embedding Predictive Architectures (JEPAs). JEPAs learn to predict in representation space rather than pixel space, avoiding the pitfalls of generative modeling while capturing semantically meaningful features suitable for downstream tasks. Our library provides modular, self-contained implementations that illustrate how representation learning techniques developed for image-level self-supervised learning can transfer to video, where temporal dynamics add complexity, and ultimately to action-conditioned world models, where the model must additionally learn to predict the effects of control inputs. Each example is designed for single-GPU training within a few hours, making energy-based self-supervised learning accessible for research and education. We provide ablations of JEPA components o...
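To make the core idea concrete, here is a minimal, hypothetical sketch of a JEPA-style training step in PyTorch. This is not the EB-JEPA API; the module names and toy data are illustrative assumptions. The point is the structure the abstract describes: a context encoder and predictor are trained to match a stop-gradient target encoder's embedding of another view, so the prediction error (the energy) is measured in representation space rather than pixel space.

```python
import torch
import torch.nn as nn

# Hypothetical components, not the library's actual classes.
dim, emb = 32, 16
context_encoder = nn.Linear(dim, emb)  # encodes the visible/context view
target_encoder = nn.Linear(dim, emb)   # encodes the target view (no gradient)
predictor = nn.Linear(emb, emb)        # predicts the target embedding

x_context = torch.randn(4, dim)        # toy "context" batch
x_target = torch.randn(4, dim)         # toy "target" batch (e.g. a masked region)

with torch.no_grad():                  # stop-gradient on the target branch
    sy = target_encoder(x_target)

sx = context_encoder(x_context)
pred = predictor(sx)
loss = ((pred - sy) ** 2).mean()       # energy: squared distance in embedding space
loss.backward()                        # gradients reach encoder + predictor only
```

In practice the target encoder is typically an exponential moving average of the context encoder, and an anti-collapse mechanism (e.g. variance regularization) is added; this sketch omits both for brevity.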