[2603.28108] Quid est VERITAS? A Modular Framework for Archival

[2603.28108] Quid est VERITAS? A Modular Framework for Archival Document Analysis

arXiv - AI March 31, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.28108: Quid est VERITAS? A Modular Framework for Archival Document Analysis

Computer Science > Digital Libraries arXiv:2603.28108 (cs) [Submitted on 30 Mar 2026] Title:Quid est VERITAS? A Modular Framework for Archival Document Analysis Authors:Leonardo Bassanini, Ludovico Biancardi, Alfio Ferrara, Andrea Gamberini, Sergio Picascia, Folco Vaglienti View a PDF of the paper titled Quid est VERITAS? A Modular Framework for Archival Document Analysis, by Leonardo Bassanini and 5 other authors View PDF HTML (experimental) Abstract:The digitisation of historical documents has traditionally been conceived as a process limited to character-level transcription, producing flat text that lacks the structural and semantic information necessary for substantive computational analysis. We present VERITAS (Vision-Enhanced Reading, Interpretation, and Transcription of Archival Sources), a modular, model-agnostic framework that reconceptualises digitisation as an integrated workflow encompassing transcription, layout analysis, and semantic enrichment. The pipeline is organised into four stages - Preprocessing, Extraction, Refinement, and Enrichment - and employs a schema-driven architecture that allows researchers to declaratively specify their extraction objectives. We evaluate VERITAS on the critical edition of Bernardino Corio's Storia di Milano, a Renaissance chronicle of over 1,600 pages. Results demonstrate that the pipeline achieves a 67.6% relative reduction in word error rate compared to a commercial OCR baseline, with a threefold reduction in end-to-end p...

Originally published on March 31, 2026. Curated by AI News.

Machine Learning

[R], 31 MILLIONS High frequency data, Light GBM worked perfectly

We just published a paper on predicting adverse selection in high-frequency crypto markets using LightGBM, and I wanted to share it here ...

Reddit - Machine Learning · 1 min · 26 minutes ago

Machine Learning

[D] Those of you with 10+ years in ML — what is the public completely wrong about?

For those of you who've been in ML/AI research or applied ML for 10+ years — what's the gap between what the public thinks AI is doing vs...

Reddit - Machine Learning · 1 min · 26 minutes ago

Ai Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 3 hours ago

Machine Learning

AI assistants are optimized to seem helpful. That is not the same thing as being helpful.

RLHF trains models on human feedback. Humans rate responses they like. And it turns out humans consistently rate confident, fluent, agree...

Reddit - Artificial Intelligence · 1 min · about 5 hours ago

[2603.28108] Quid est VERITAS? A Modular Framework for Archival Document Analysis

About this article

Related Articles

[R], 31 MILLIONS High frequency data, Light GBM worked perfectly

[D] Those of you with 10+ years in ML — what is the public completely wrong about?

UMKC Announces New Master of Science in Artificial Intelligence

AI assistants are optimized to seem helpful. That is not the same thing as being helpful.

No comments

Stay updated with AI News