[2602.13345] BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents

[2602.13345] BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents

arXiv - Machine Learning 3 min read Article

Summary

The paper presents Blueprint, a multimodal retrieval system designed to enhance the accessibility of complex engineering drawings and documents by automating metadata generation and improving search capabilities.

Why It Matters

With many engineering documents trapped in legacy systems, Blueprint addresses a critical need for efficient retrieval and organization. Its innovative approach combines vision-language models with advanced OCR techniques, making it easier for organizations to access and utilize historical engineering data, which can lead to improved productivity and innovation in engineering fields.

Key Takeaways

  • Blueprint automates the retrieval of complex engineering documents.
  • The system significantly improves search accuracy with a 10.1% gain in Success@3.
  • It uses region-aware multimodal techniques to enhance metadata generation.
  • The research includes a comprehensive evaluation with expert-curated queries.
  • All related resources are made available for reproducibility and further research.

Computer Science > Machine Learning arXiv:2602.13345 (cs) [Submitted on 12 Feb 2026] Title:BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents Authors:Ethan Seefried, Ran Eldegaway, Sanjay Das, Nathaniel Blanchard, Tirthankar Ghosal View a PDF of the paper titled BLUEPRINT Rebuilding a Legacy: Multimodal Retrieval for Complex Engineering Drawings and Documents, by Ethan Seefried and 4 other authors View PDF HTML (experimental) Abstract:Decades of engineering drawings and technical records remain locked in legacy archives with inconsistent or missing metadata, making retrieval difficult and often manual. We present Blueprint, a layout-aware multimodal retrieval system designed for large-scale engineering repositories. Blueprint detects canonical drawing regions, applies region-restricted VLM-based OCR, normalizes identifiers (e.g., DWG, part, facility), and fuses lexical and dense retrieval with a lightweight region-level reranker. Deployed on ~770k unlabeled files, it automatically produces structured metadata suitable for cross-facility search. We evaluate Blueprint on a 5k-file benchmark with 350 expert-curated queries using pooled, graded (0/1/2) relevance judgments. Blueprint delivers a 10.1% absolute gain in Success@3 and an 18.9% relative improvement in nDCG@3 over the strongest vision-language baseline}, consistently outperforming across vision, text, and multimodal intents. Oracle ablations reveal substantial headroom ...

Related Articles

The Galaxy S26’s photo app can sloppify your memories | The Verge
Nlp

The Galaxy S26’s photo app can sloppify your memories | The Verge

Samsung’s S26 series offers some new AI photo editing capabilities to transform your photos. But where’s the line between acceptable edit...

The Verge - AI · 8 min ·
Llms

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] I had an idea, would love your thoughts

What happens that while training an AI during pre training we make it such that if makes "misaligned behaviour" then we just reduce like ...

Reddit - Machine Learning · 1 min ·
Machine Learning

I had an idea, would love your thoughts

What happens that while training an AI during pre training we make it such that if makes "misaligned behaviour" then we just reduce like ...

Reddit - Artificial Intelligence · 1 min ·
More in Nlp: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime