[2603.28028] Efficient Domain Adaptation for Text Line Recognition via Decoupled Language Models
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.28028 (cs) [Submitted on 30 Mar 2026]
Title: Efficient Domain Adaptation for Text Line Recognition via Decoupled Language Models
Authors: Arundhathi Dev, Justin Zhan
Abstract: Optical character recognition remains critical infrastructure for document digitization, yet state-of-the-art performance is often restricted to well-resourced institutions by prohibitive computational barriers. End-to-end transformer architectures achieve strong accuracy but demand hundreds of GPU hours for domain adaptation, limiting accessibility for practitioners and digital humanities scholars. We present a modular detection-and-correction framework that achieves near-SOTA accuracy with single-GPU training. Our approach decouples lightweight visual character detection (domain-agnostic) from domain-specific linguistic correction using pretrained sequence models including T5, ByT5, and BART. By training the correctors entirely on synthetic noise, we enable annotation-free domain adaptation without requiring labeled target images. Evaluating across modern clean handwriting, cursive script, and historical documents, we identify a critical "Pareto frontier" in architecture selection: T5-Base excels on modern text with standard vocabulary, whereas ByT5-Bas...
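The abstract's core idea, training a corrector on synthetic noise so that no labeled target images are needed, can be sketched as a noise-injection routine that corrupts clean target-domain text with OCR-style errors, producing (noisy, clean) pairs for a sequence model such as T5 or ByT5. The confusion table and error rates below are illustrative assumptions, not the paper's actual noise model.

```python
import random

# Illustrative visual-confusion substitutions (assumed, not from the paper).
CONFUSIONS = {"l": "1", "o": "0", "e": "c", "m": "rn", "u": "v"}


def corrupt(text, p_sub=0.05, p_del=0.02, p_ins=0.02, seed=None):
    """Return an OCR-style corrupted copy of `text`.

    p_sub: probability of a visual-confusion substitution per character
    p_del: probability of dropping a character (missed detection)
    p_ins: probability of inserting a stray mark after a character
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        r = rng.random()
        if r < p_del:
            continue  # simulate a missed character
        if r < p_del + p_sub:
            out.append(CONFUSIONS.get(ch, ch))  # visual confusion
        else:
            out.append(ch)
        if rng.random() < p_ins:
            out.append(rng.choice("~.,'"))  # stray mark / speckle
    return "".join(out)


clean = "the quick brown fox"
noisy = corrupt(clean, seed=0)
# Training pair for the corrector: input = noisy, target = clean.
```

Because the corruption operates purely on text, any in-domain corpus (e.g. transcriptions from related documents) can be turned into training data for the corrector, which is what makes the adaptation annotation-free.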