[2603.03269] LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.03269 (cs)
[Submitted on 3 Mar 2026]

Title: LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory
Authors: Junyi Zhang, Charles Herrmann, Junhwa Hur, Chen Sun, Ming-Hsuan Yang, Forrester Cole, Trevor Darrell, Deqing Sun

Abstract: Feedforward geometric foundation models achieve strong short-window reconstruction, yet scaling them to minutes-long videos is bottlenecked by the quadratic complexity of attention or by the limited effective memory of recurrent designs. We present LoGeR (Long-context Geometric Reconstruction), a novel architecture that scales dense 3D reconstruction to extremely long sequences without post-optimization. LoGeR processes video streams in chunks, leveraging strong bidirectional priors for high-fidelity intra-chunk reasoning. To address the critical challenge of coherence across chunk boundaries, we propose a learning-based hybrid memory module. This dual-component system combines a parametric Test-Time Training (TTT) memory, which anchors the global coordinate frame and prevents scale drift, with a non-parametric Sliding Window Attention (SWA) mechanism, which preserves uncompressed context for high-precision alignment of adjacent chunks. Remarkably, this memory architecture enables LoGeR to be trained on sequences of 128 frames and generalize up...