[2604.08718] Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring
Computer Science > Computer Vision and Pattern Recognition
arXiv:2604.08718 (cs)
[Submitted on 9 Apr 2026]

Title: Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring

Authors: Xinmiao Xiong, Bangya Liu, Hao Wang, Dayou Li, Nuo Chen, Andrew Feng, Mingyu Ding, Suman Banerjee, Yang Zhou, Zhiwen Fan

Abstract: Geometric Foundation Models (GFMs) have recently advanced monocular SLAM by providing robust, calibration-free 3D priors. However, deploying these models on dense video streams introduces significant computational redundancy. Current GFM-based SLAM systems typically rely on post hoc keyframe selection. Because of this, they must perform expensive dense geometric decoding simply to determine whether a frame contains novel geometry, resulting in late rejection and wasted computation. To mitigate this inefficiency, we propose LeanGate, a lightweight feed-forward frame-gating network. LeanGate predicts a geometric utility score to assess a frame's mapping value prior to the heavy GFM feature extraction and matching stages. As a predictive plug-and-play module, our approach bypasses over 90% of redundant frames. Evaluations on standard SLAM benchmarks demonstrate that LeanGate reduces tracking FLOPs by more than 85% and achieves a 5x end-to-end throughput speedup. Furthermore, it mainta...
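
The gating idea described in the abstract (score each frame cheaply, then run the heavy stage only on frames that pass) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `run_gated`, the callables `utility_fn` and `heavy_fn`, and the fixed `threshold` are all hypothetical stand-ins, and the abstract does not say how LeanGate turns its score into a keep/skip decision.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

F = TypeVar("F")  # frame type (hypothetical; e.g. an image array)
R = TypeVar("R")  # output type of the heavy stage


def run_gated(
    frames: Sequence[F],
    utility_fn: Callable[[F], float],  # assumed: cheap predicted utility score
    heavy_fn: Callable[[F], R],        # assumed: expensive feature extraction / matching
    threshold: float,                  # assumed: fixed cut-off on the score
) -> Tuple[List[Tuple[int, R]], int]:
    """Run heavy_fn only on frames whose utility score reaches the threshold.

    Returns the (frame index, heavy result) pairs for kept frames and the
    number of frames skipped before any heavy computation was done.
    """
    results: List[Tuple[int, R]] = []
    skipped = 0
    for idx, frame in enumerate(frames):
        if utility_fn(frame) < threshold:
            skipped += 1  # rejected early: the heavy stage never sees this frame
            continue
        results.append((idx, heavy_fn(frame)))
    return results, skipped


# Toy usage: each "frame" is just a number that doubles as its own score.
toy_frames = [0.1, 0.9, 0.2, 0.95]
results, skipped = run_gated(
    toy_frames,
    utility_fn=lambda f: f,
    heavy_fn=lambda f: f"decoded({f})",
    threshold=0.5,
)
print([idx for idx, _ in results], skipped)
```

The point of the sketch is the ordering: the score is computed before the expensive call, so a rejected frame costs only the scoring step, in contrast to the post hoc selection the abstract describes, where the dense decoding has already been paid for by the time a frame is discarded.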