[2602.22124] SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents
Summary
The paper presents SWE-Protégé, a post-training framework that turns small language models (SLMs) into capable software engineering agents by teaching them to selectively collaborate with a stronger expert model, lifting a lightly post-trained 7B model to 42.4% Pass@1 on SWE-bench Verified.
Why It Matters
This research addresses a core limitation of small language models on long-horizon software engineering tasks, where they tend to loop on unproductive actions and resolve few issues. By combining expert guidance with supervised fine-tuning and agentic reinforcement learning, it makes cheap, low-latency SLMs practical as autonomous coding agents.
Key Takeaways
- SWE-Protégé improves small language models' performance on software engineering tasks.
- The SLM remains the sole decision-maker; it learns when to seek expert guidance, how to recognize stalled states, and how to follow through on the expert's feedback.
- Lightly post-training Qwen2.5-Coder-7B-Instruct with this approach reaches 42.4% Pass@1 on SWE-bench Verified, a +25.4% improvement.
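The selective-collaboration idea in the takeaways above can be illustrated with a toy control loop. This is a hedged sketch, not the paper's actual implementation: the `policy`, `expert`, and `env` interfaces and the stall heuristic are hypothetical stand-ins. The key property it demonstrates is that the small model always chooses the action itself; the expert only supplies feedback when the agent detects that it has stalled.

```python
from collections import deque

def is_stalled(history, window=4):
    """Crude stall detector (illustrative assumption): the agent is
    considered stuck if its last `window` actions contain at most
    two distinct actions, i.e. it is looping."""
    recent = list(history)[-window:]
    return len(recent) == window and len(set(recent)) <= 2

def run_episode(policy, expert, env, max_steps=50):
    """Toy loop in the spirit of SWE-Protégé: `policy` (the SLM) stays
    the sole decision-maker, but on a detected stall it requests
    guidance from `expert` and conditions its next action on it."""
    history = deque(maxlen=8)
    obs, feedback = env.reset(), None
    for _ in range(max_steps):
        if is_stalled(history):
            feedback = expert(obs, history)   # guidance, not a takeover
        action = policy(obs, feedback)        # the SLM still picks the action
        history.append(action)
        obs, done = env.step(action)
        feedback = None                       # feedback applies to one step
        if done:
            return True
    return False
```

The design point mirrors the paper's framing: the expert is consulted selectively rather than on every step, so the system keeps the SLM's cost and latency advantages while escaping the action loops that otherwise dominate its failures.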
Computer Science > Software Engineering
arXiv:2602.22124 (cs)
[Submitted on 25 Feb 2026]

Title: SWE-Protégé: Learning to Selectively Collaborate With an Expert Unlocks Small Language Models as Software Engineering Agents
Authors: Patrick Tser Jern Kon, Archana Pradeep, Ang Chen, Alexander P. Ellis, Warren Hunt, Zijian Wang, John Yang, Samuel Thompson

Abstract: Small language models (SLMs) offer compelling advantages in cost, latency, and adaptability, but have so far lagged behind larger models on long-horizon software engineering tasks such as SWE-bench, where they suffer from pervasive action looping and low resolution rates. We introduce SWE-Protégé, a post-training framework that reframes software repair as an expert-protégé collaboration problem. In SWE-Protégé, an SLM remains the sole decision-maker while learning to selectively seek guidance from a strong expert model, recognize stalled states, and follow through on expert feedback. Our approach combines supervised fine-tuning on expert-augmented trajectories with agentic reinforcement learning that explicitly discourages degenerative looping and unproductive expert collaboration. We lightly post-train Qwen2.5-Coder-7B-Instruct to achieve 42.4% Pass@1 on SWE-bench Verified, a +25.4% improvement...
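The abstract's reinforcement-learning objective, which "explicitly discourages degenerative looping and unproductive expert collaboration", can be sketched as reward shaping. The function below is an illustrative assumption: the paper does not publish this exact formula, and the terms, counters, and coefficients (`loop_penalty`, `call_penalty`) are hypothetical.

```python
def shaped_reward(resolved, actions, expert_calls, useful_calls,
                  loop_penalty=0.1, call_penalty=0.05):
    """Hedged sketch of a shaped episode reward in the spirit of the
    paper's agentic RL: reward task success, penalize looping actions,
    and penalize expert queries whose guidance went unused. All terms
    and coefficients here are illustrative, not the paper's."""
    r = 1.0 if resolved else 0.0
    # Immediate repeats serve as a crude signal of degenerative looping.
    loops = sum(1 for a, b in zip(actions, actions[1:]) if a == b)
    # Expert calls that produced no follow-through count as unproductive.
    wasted = max(expert_calls - useful_calls, 0)
    return r - loop_penalty * loops - call_penalty * wasted
```

Under a shaping scheme like this, an agent that resolves the issue while querying the expert only when the guidance is actually used keeps nearly the full success reward, whereas looping or spamming the expert erodes it.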