[2602.22217] RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge
Summary
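The paper presents RAGdb, a monolithic architecture for Retrieval-Augmented Generation (RAG) that packs automated multimodal ingestion, ONNX-based extraction, and hybrid vector retrieval into a single portable SQLite container. It removes the dependency on cloud-hosted vector databases and GPU inference servers, which makes it suitable for edge computing.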
Why It Matters
Prevailing RAG stacks typically require cloud-hosted vector databases, heavy deep learning frameworks (e.g., PyTorch, CUDA), and high-latency embedding inference servers. RAGdb removes those dependencies, lowering the barrier to entry for edge computing, air-gapped environments, and privacy-constrained applications where data sovereignty is paramount. This could make RAG practical in decentralized and local-first settings.
Key Takeaways
- RAGdb consolidates automated multimodal ingestion, ONNX-based extraction, and hybrid vector retrieval into a single, portable SQLite container, reducing complexity.
- The architecture achieves high efficiency in data ingestion and retrieval without relying on GPU inference.
- It significantly decreases the storage requirements compared to traditional RAG systems.
- RAGdb is designed for edge computing, making it suitable for privacy-constrained applications.
- The proposed deterministic Hybrid Scoring Function (HSF) combines sublinear TF-IDF vectorization with exact substring boosting, so retrieval needs no GPU inference at query time.
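To make the single-container idea concrete, here is a minimal sketch of keeping document text and its sparse term-weight vector in one SQLite file, using only Python's standard library. The table name `documents`, its columns, and the helper names `open_store`, `add_document`, and `load_documents` are hypothetical choices for illustration; the paper's actual schema is not given in this excerpt.

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a single-file store. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id       INTEGER PRIMARY KEY,
               source   TEXT NOT NULL,  -- original file name or URI
               modality TEXT NOT NULL,  -- e.g. 'text', 'image', 'audio'
               content  TEXT NOT NULL,  -- extracted text
               vector   TEXT NOT NULL   -- sparse term weights as JSON
           )"""
    )
    return conn


def add_document(conn, source, modality, content, vector):
    """Insert one document and return its row id."""
    cur = conn.execute(
        "INSERT INTO documents (source, modality, content, vector) "
        "VALUES (?, ?, ?, ?)",
        (source, modality, content, json.dumps(vector)),
    )
    conn.commit()
    return cur.lastrowid


def load_documents(conn):
    """Return all rows in insertion order, with vectors decoded from JSON."""
    rows = conn.execute(
        "SELECT id, source, modality, content, vector "
        "FROM documents ORDER BY id"
    )
    return [(i, s, m, c, json.loads(v)) for i, s, m, c, v in rows]
```

Because everything lives in one database file, copying that file moves the whole index; passing a file path instead of `":memory:"` to `open_store` persists it to disk.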
Computer Science > Information Retrieval
arXiv:2602.22217 (cs)
[Submitted on 9 Dec 2025]
Title: RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge
Authors: Ahmed Bin Khalid
Abstract: Retrieval-Augmented Generation (RAG) has established itself as the standard paradigm for grounding Large Language Models (LLMs) in domain-specific, up-to-date data. However, the prevailing architecture for RAG has evolved into a complex, distributed stack requiring cloud-hosted vector databases, heavy deep learning frameworks (e.g., PyTorch, CUDA), and high-latency embedding inference servers. This "infrastructure bloat" creates a significant barrier to entry for edge computing, air-gapped environments, and privacy-constrained applications where data sovereignty is paramount. This paper introduces RAGdb, a novel monolithic architecture that consolidates automated multimodal ingestion, ONNX-based extraction, and hybrid vector retrieval into a single, portable SQLite container. We propose a deterministic Hybrid Scoring Function (HSF) that combines sublinear TF-IDF vectorization with exact substring boosting, eliminating the need for GPU inference at query time. Experimental evaluation on an Intel i7-1165G7 consumer laptop demonstrates that RAGdb achieves ...
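The abstract names the two ingredients of the Hybrid Scoring Function but this excerpt does not give its formula. The sketch below is one plausible reading, not the paper's definition: cosine similarity over sublinear TF-IDF vectors (term weight `1 + log(tf)` times a smoothed IDF) plus a fixed additive bonus when the query occurs verbatim in the document. The additive combination, the `boost=0.5` weight, the IDF smoothing, and all function names are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(docs):
    """Smoothed inverse document frequency (assumed form)."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def vectorize(text, idf):
    """L2-normalized sparse vector with sublinear TF: (1 + log tf) * idf."""
    counts = Counter(tokenize(text))
    vec = {t: (1.0 + math.log(c)) * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(a, b):
    """Dot product of two already-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def hybrid_score(query, doc, idf, boost=0.5):
    """Cosine similarity plus an assumed additive bonus for a verbatim match."""
    sim = cosine(vectorize(query, idf), vectorize(doc, idf))
    needle = query.strip().lower()
    exact = 1.0 if needle and needle in doc.lower() else 0.0
    return sim + boost * exact


def rank(query, docs, boost=0.5):
    """Return (score, doc_index) pairs, best first; ties break by index."""
    idf = build_idf(docs)
    scored = [(hybrid_score(query, d, idf, boost), i) for i, d in enumerate(docs)]
    return sorted(scored, key=lambda s: (-s[0], s[1]))
```

Everything here is plain arithmetic over token counts, so the same query and corpus always yield the same ranking and no model inference is involved at query time, which matches the "deterministic" and "no GPU" properties the abstract claims for HSF.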