[2603.01494] Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision
Computer Science > Software Engineering
arXiv:2603.01494 (cs) [Submitted on 2 Mar 2026]

Title: Inference-Time Safety For Code LLMs Via Retrieval-Augmented Revision
Authors: Manisha Mukherjee, Vincent J. Hellendoorn

Abstract: Large Language Models (LLMs) are increasingly deployed for code generation in high-stakes software development, yet their limited transparency in security reasoning and brittleness to evolving vulnerability patterns raise critical trustworthiness concerns. Models trained on static datasets cannot readily adapt to newly discovered vulnerabilities or changing security standards without retraining, leading to the repeated generation of unsafe code. We present a principled approach to trustworthy code generation by design that operates as an inference-time safety mechanism. Our approach employs retrieval-augmented generation to surface relevant security risks in generated code and retrieve related security discussions from a curated Stack Overflow knowledge base, which are then used to guide an LLM during code revision. This design emphasizes three aspects relevant to trustworthiness: (1) interpretability, through transparent safety interventions grounded in expert community explanations; (2) robustness, by allowing adaptation to evolving security practices without model retraining; a...
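The abstract describes an inference-time pipeline: retrieve security discussions relevant to freshly generated code, then feed them to the LLM as revision guidance. The sketch below illustrates that shape under stated assumptions: the knowledge base is a three-entry stub (the paper uses a curated Stack Overflow corpus), the "embedding" is a bag-of-words counter rather than a learned encoder, and `build_revision_prompt` is a hypothetical helper, not an API from the paper.

```python
import math
from collections import Counter

# Hypothetical mini knowledge base of Stack Overflow-style security notes.
# The paper's curated corpus would be far larger; this is a stand-in.
KNOWLEDGE_BASE = [
    "Avoid building SQL queries with string formatting; use parameterized queries.",
    "Do not use pickle on untrusted input; it allows arbitrary code execution.",
    "Hash passwords with a slow KDF such as bcrypt, not plain MD5 or SHA-1.",
]

def _bow(text):
    """Bag-of-words vector (a toy substitute for a learned embedding)."""
    return Counter(text.lower().split())

def _cosine(a, b):
    num = sum(a[t] * b[t] for t in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(code, k=2):
    """Return the k knowledge-base entries most similar to the generated code."""
    q = _bow(code)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda d: _cosine(q, _bow(d)), reverse=True)
    return ranked[:k]

def build_revision_prompt(code, k=2):
    """Ground the revision step in retrieved advice instead of the model's priors."""
    bullets = "\n".join(f"- {a}" for a in retrieve(code, k))
    return ("Revise the following code to address these security concerns:\n"
            f"{bullets}\n\nCode:\n{code}")

# Example: model output that concatenates user input into raw SQL.
generated = "sql = 'SELECT * FROM users WHERE id = ' + user_input"
prompt = build_revision_prompt(generated)
```

Because the safety signal lives in the retrieved text rather than in model weights, updating the knowledge base (e.g. adding a note about a newly disclosed vulnerability) changes the revision behavior without any retraining, which is the robustness property the abstract claims.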