[2602.14012] From SFT to RL: Demystifying the Post-Training Pipeline for LLM-based Vulnerability Detection

arXiv - AI · 4 min read

Summary

This article examines the post-training pipeline for LLM-based vulnerability detection, covering supervised fine-tuning (SFT), preference optimization, and reinforcement learning (RL), and distilling practical guidelines for data curation, reward design, and evaluation.

Why It Matters

As the integration of large language models (LLMs) into vulnerability detection becomes more prevalent, understanding the post-training processes is crucial for enhancing model performance and reliability. This research provides foundational insights that can guide future developments in AI-driven security solutions.

Key Takeaways

  • SFT based on rejection sampling outperforms rationalization-based supervision.
  • Excessive SFT can inhibit self-exploration during RL, limiting performance gains.
  • Fine-grained reward signals improve RL training efficiency compared to coarse-grained signals (see the sketch after this list).
  • Filtering hard-to-detect vulnerabilities can enhance RL training but may incur performance costs.
  • Models trained with Group Relative Policy Optimization (GRPO) show significant advantages over those using SFT and preference optimization.
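
The reward-granularity and GRPO takeaways are easier to see in code. Below is a minimal Python sketch of a fine-grained reward for a vulnerability-detection rollout and of GRPO-style group-normalized advantages. The verdict format, the reward components and their weights, and the `fine_grained_reward`/`grpo_advantages` names are illustrative assumptions, not the paper's actual design.

```python
import re
import statistics

def fine_grained_reward(response: str, label: dict) -> float:
    """Score one rollout with partial credit instead of a single
    pass/fail signal. Components and weights are assumptions, not
    the paper's actual reward design."""
    reward = 0.0
    # Format adherence: response must contain an explicit verdict.
    m = re.search(r"verdict:\s*(vulnerable|safe)", response.lower())
    if m is None:
        return -1.0  # malformed output gets a flat penalty
    reward += 0.2
    # Verdict correctness (the coarse-grained part).
    if (m.group(1) == "vulnerable") == label["is_vulnerable"]:
        reward += 0.5
    # Partial credit: naming the correct CWE type.
    if label.get("cwe") and label["cwe"].lower() in response.lower():
        reward += 0.3
    return reward

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO-style advantages: normalize each rollout's reward by the
    mean/std of its own sampled group (no learned value critic)."""
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard against zero std
    return [(r - mu) / sigma for r in rewards]

# Example: one prompt, a group of 4 sampled responses.
label = {"is_vulnerable": True, "cwe": "CWE-787"}
group = [
    "The buffer write is unchecked (CWE-787). verdict: vulnerable",
    "Bounds look fine. verdict: safe",
    "Possible overflow. verdict: vulnerable",
    "I am not sure.",  # malformed: no verdict line
]
rewards = [fine_grained_reward(r, label) for r in group]
print(rewards)                   # [1.0, 0.2, 0.7, -1.0]
print(grpo_advantages(rewards))  # each rollout scored relative to its group
```

Because GRPO scores each response against its own group's mean rather than a value network, a coarse binary reward would leave many groups all-correct or all-wrong, producing zero advantages and no learning signal; finer-grained rewards keep within-group variance, which is one plausible reading of the takeaway above.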

Abstract

Computer Science > Cryptography and Security · arXiv:2602.14012 (cs) · Submitted on 15 Feb 2026
Authors: Youpeng Li, Fuxun Yu, Xinda Wang

The integration of LLMs into vulnerability detection (VD) has shifted the field toward interpretable and context-aware analysis. While post-training methods have shown promise in general coding tasks, their systematic application to VD remains underexplored. In this paper, we present the first comprehensive investigation into the post-training pipeline for LLM-based VD, spanning from cold-start SFT to off-policy preference optimization and on-policy RL, uncovering how data curation, stage interactions, reward mechanisms, and evaluation protocols collectively dictate the efficacy of model training and assessment. Our study identifies practical guidelines and insights: (1) SFT based on rejection sampling greatly outperforms rationalization-based supervision, which can introduce hallucinations due to ground-truth leakage. (2) While increased SFT epochs consistently benefit preference optimization, excessive SFT inhibits self-exploration during RL, ultimately limiting performance gains. (3) Coarse-grained reward signals often mislead RL, whereas fine-grained signals improve training efficiency. […]
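
To ground finding (1), here is a minimal sketch of rejection-sampling data curation for SFT, contrasted with the leakage risk the abstract describes: the prompt never contains the ground-truth label, and a rationale is kept only if its independently produced verdict matches that label. The `generate` callable, the verdict format, and the field names are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable

def curate_sft_data(
    samples: list[dict],  # each: {"code": str, "is_vulnerable": bool}
    generate: Callable[[str, int], list[str]],  # (prompt, k) -> k completions
    k: int = 8,
) -> list[dict]:
    """Rejection-sampling curation: keep only self-generated rationales
    whose final verdict agrees with the ground truth. The label is never
    shown to the model, so it cannot leak into the rationale."""
    curated = []
    for s in samples:
        prompt = (
            "Analyze the following code for vulnerabilities, explain your "
            "reasoning, then end with 'verdict: vulnerable' or "
            "'verdict: safe'.\n\n" + s["code"]
        )
        for completion in generate(prompt, k):
            text = completion.rstrip().lower()
            if text.endswith("verdict: vulnerable"):
                predicted = True
            elif text.endswith("verdict: safe"):
                predicted = False
            else:
                continue  # no parsable verdict: reject this completion
            if predicted == s["is_vulnerable"]:
                curated.append({"prompt": prompt, "response": completion})
                break  # one accepted rationale per example in this sketch
    return curated
```

By contrast, rationalization-based supervision would place the label in the prompt and ask the model to justify it, which tends to yield fluent but potentially hallucinated rationales: the ground-truth-leakage failure mode the abstract calls out.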

Related Articles

Continuous Knowledge Transfer Between Claude and Codex

For the last 8 months I've developed strictly using Claude Code, setting up context layers, hooks, skills, etc. But relying on one model ...

Reddit - Artificial Intelligence · 1 min

Claude Suffered a 'Major Outage.' Anthropic Says It's Fixed.

AI Tools & Products · 3 min

Anthropic's latest AI model identifies 'thousands of zero-day vulnerabilities' in 'every major operating system and every major web browser' — Claude Mythos Preview sparks race to fix critical bugs, some unpatched for decades

AI Tools & Products · 6 min

Thinking small: How small language models could lessen the AI energy burden

According to researchers, for many industries, small language models may offer a host of advantages to energy- and resource-intensive lar...

AI Tools & Products · 5 min