[2602.17835] Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning
Summary
The paper presents Iprox, a two-stage framework that constructs influence-preserving proxies for gradient-based data selection in LLM fine-tuning, reducing the computational cost of influence estimation while maintaining downstream model performance.
Why It Matters
As large language models (LLMs) grow in size, computing gradient-based influence scores on the full model becomes prohibitively expensive. Iprox addresses the limitations of off-the-shelf proxy models by deriving proxies directly from the target model, preserving the target model's influence estimates and making gradient-based data selection practical for multi-billion-parameter LLMs.
Key Takeaways
- Iprox offers a two-stage framework for constructing influence-preserving proxies.
- The method significantly reduces computational costs while maintaining model performance.
- Experimental results show Iprox outperforms off-the-shelf proxies and baseline methods.
- The framework is adaptable across different LLM families and evaluation tasks.
- Iprox enhances the scalability of gradient-based data selection in LLM fine-tuning.
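The abstract names TracIn as one of the gradient-based selection methods whose cost Iprox aims to reduce. As background, the TracIn-style score for a training example is a learning-rate-weighted sum, over checkpoints, of dot products between the training example's gradient and a validation example's gradient. A minimal sketch on a toy squared-loss linear model (the model, data, and checkpoints here are illustrative, not from the paper):

```python
import numpy as np

def grad_sq_loss(w, x, y):
    # Gradient of 0.5 * (w.x - y)^2 with respect to w for one example.
    return (w @ x - y) * x

def tracin_influence(checkpoints, lrs, train_ex, val_ex):
    # TracIn-style score: sum over checkpoints of lr * <g_train, g_val>.
    score = 0.0
    for w, lr in zip(checkpoints, lrs):
        g_tr = grad_sq_loss(w, *train_ex)
        g_va = grad_sq_loss(w, *val_ex)
        score += lr * float(g_tr @ g_va)
    return score

rng = np.random.default_rng(0)
checkpoints = [rng.normal(size=4) for _ in range(3)]  # toy saved weights
lrs = [0.1, 0.1, 0.1]
train_pool = [(rng.normal(size=4), rng.normal()) for _ in range(8)]
val_ex = (rng.normal(size=4), rng.normal())

# Rank the training pool by estimated influence on the validation example
# and keep the top half -- the selection step the proxy must approximate.
scores = [tracin_influence(checkpoints, lrs, ex, val_ex) for ex in train_pool]
top = sorted(range(len(train_pool)), key=lambda i: scores[i], reverse=True)[:4]
```

The per-example gradient computation is exactly what scales poorly with model size; Iprox's proxies are meant to stand in for the full model inside `grad_sq_loss`-style calls.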
Computer Science > Machine Learning
arXiv:2602.17835 (cs)
[Submitted on 19 Feb 2026]
Authors: Sirui Chen, Yunzhe Qi, Mengting Ai, Yifan Sun, Ruizhong Qiu, Jiaru Zou, Jingrui He
Abstract: Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions leverage influence to identify useful samples, but their computational cost scales poorly, making them impractical for multi-billion-parameter large language models (LLMs). A common alternative is to use off-the-shelf smaller models as proxies, but they remain suboptimal since their learning dynamics are unclear, their sizes cannot be flexibly adjusted, and they cannot be further aligned with the target model in terms of gradient-based influence estimation. To address these challenges, we introduce Iprox, a two-stage framework that derives influence-preserving proxies directly from the target model. It first applies a low-rank compression stage to preserve influence information of the target model, and then an aligning stage to align both model gradients and logits, thereby constructing proxies that flexibly control computational c...
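The abstract describes the first stage as low-rank compression of the target model. The paper's exact construction is not given in this summary; as an illustration of the general idea, truncated SVD can factor a weight matrix into two thin matrices, cutting parameter count while keeping the dominant directions that carry most of the influence-relevant signal. All names and sizes below are hypothetical:

```python
import numpy as np

def low_rank_factors(W, rank):
    # Truncated SVD: W (m x n) is approximated by A @ B with
    # A (m x rank) and B (rank x n) -- the best rank-`rank`
    # approximation in Frobenius norm.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # absorb singular values into A
    B = Vt[:rank]
    return A, B

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))          # stand-in for one target-model weight
A, B = low_rank_factors(W, rank=8)

# Parameter count: 64*64 = 4096 in W vs. 2*64*8 = 1024 in the factors.
proxy_params = A.size + B.size
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

The second stage described in the abstract (aligning the proxy's gradients and logits with the target model's) would then further train such factors; that step is not sketched here, since the summary gives no details of the alignment objective.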