[2512.02966] Lumos: Let there be Language Model System Certification


About this article


Computer Science > Programming Languages
arXiv:2512.02966 (cs)
[Submitted on 2 Dec 2025 (v1), last revised 31 Mar 2026 (this version, v2)]

Title: Lumos: Let there be Language Model System Certification
Authors: Isha Chaudhary, Vedaant Jain, Prineet Parhar, Kavya Sachdeva, Avaljot Singh, Sayan Ranu, Gagandeep Singh

Abstract: We introduce Lumos, the first principled framework for specifying and formally certifying Language Model System (LMS) behaviors. Lumos is an imperative probabilistic programming DSL over graphs, with constructs to generate independent and identically distributed prompts for an LMS. It offers a structured view of prompt distributions via graphs, forming random prompts from sampled subgraphs. Lumos supports certifying an LMS over arbitrary prompt distributions via integration with statistical certifiers. We provide hybrid (operational and denotational) semantics for Lumos, giving a rigorous way to interpret specifications. Using only a small set of composable constructs, Lumos can encode existing LMS specifications, including complex relational and temporal ones. It also facilitates specifying new properties: we present the first safety specifications for vision-language models (VLMs) in autonomous driving scenarios, developed with Lumos. Using these, we show that the state-of-the-art VLM Qwen-VL exhibits critical ...
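The abstract's core idea, forming random prompts by sampling subgraphs from a graph-structured prompt distribution, can be illustrated with a minimal sketch. This is a hypothetical toy, not the actual Lumos DSL: the graph shape, the random-walk sampling, and names like `sample_iid_prompts` are all invented here for illustration.

```python
import random

def sample_prompt(graph, start, walk_len, rng):
    """Draw one prompt by a random walk over a graph of text fragments.

    Each node carries a fragment; a sampled path (a simple kind of
    subgraph) is linearized into a prompt string.
    """
    node = start
    fragments = [graph[node]["text"]]
    for _ in range(walk_len):
        successors = graph[node]["next"]
        if not successors:
            break
        node = rng.choice(successors)
        fragments.append(graph[node]["text"])
    return " ".join(fragments)

def sample_iid_prompts(graph, start, walk_len, n, seed=0):
    """Draw n independent, identically distributed prompts."""
    rng = random.Random(seed)
    return [sample_prompt(graph, start, walk_len, rng) for _ in range(n)]

# Toy fragment graph loosely themed on the paper's driving scenarios:
# the middle node is chosen at random, so prompts vary i.i.d.
graph = {
    "scene": {"text": "A vehicle approaches an intersection",
              "next": ["rain", "fog"]},
    "rain":  {"text": "in heavy rain.", "next": ["query"]},
    "fog":   {"text": "in dense fog.", "next": ["query"]},
    "query": {"text": "Should it yield to the pedestrian?", "next": []},
}
prompts = sample_iid_prompts(graph, "scene", walk_len=2, n=3)
```

Each sampled prompt shares the fixed scene and question but draws the weather condition at random, which is the kind of structured prompt distribution a statistical certifier could then sample from.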

Originally published on April 02, 2026. Curated by AI News.

Related Articles

[2602.00750] Bypassing Prompt Injection Detectors through Evasive Injections
[2511.08225] Benchmarking Educational LLMs with Analytics: A Case Study on Gender Bias in Feedback
[2511.20224] DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling
[2506.09354] "Is This Really a Human Peer Supporter?": Misalignments Between Peer Supporters and Experts in LLM-Supported Interactions
