[2602.17835] Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning

arXiv - Machine Learning

Summary

The paper presents Iprox, a two-stage framework for gradient-based data selection in LLM fine-tuning. It derives small, influence-preserving proxy models directly from the target model, cutting the cost of influence computation while maintaining downstream performance.

Why It Matters

As large language models (LLMs) grow in size, efficient data selection becomes crucial for cost-effective fine-tuning. Iprox addresses the limitations of off-the-shelf proxy models by preserving the target model's gradient-based influence estimates, providing a scalable drop-in for existing selection pipelines.

Key Takeaways

  • Iprox offers a two-stage framework for constructing influence-preserving proxies.
  • The method significantly reduces computational costs while maintaining model performance.
  • Experimental results show Iprox outperforms off-the-shelf proxies and baseline methods.
  • The framework is adaptable across different LLM families and evaluation tasks.
  • Iprox enhances the scalability of gradient-based data selection in LLM fine-tuning.
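The gradient-based selection that Iprox accelerates typically scores a training example by its TracIn-style influence: a learning-rate-weighted dot product between the training example's gradient and a validation example's gradient, summed over saved checkpoints. Below is a minimal sketch of that score; the function name, the flattened-gradient representation, and the toy dimensions are illustrative choices, not details from the paper.

```python
import numpy as np

def tracin_influence(train_grads, test_grads, lrs):
    # TracIn first-order influence of one training example on one
    # validation example: sum_k  lr_k * <g_train(k), g_test(k)>,
    # where k indexes saved checkpoints and each gradient is a
    # flattened parameter-gradient vector at that checkpoint.
    return sum(lr * float(np.dot(g_tr, g_te))
               for lr, g_tr, g_te in zip(lrs, train_grads, test_grads))

# Toy illustration: two checkpoints, 4-dimensional gradients.
rng = np.random.default_rng(0)
train_grads = [rng.standard_normal(4) for _ in range(2)]
test_grads = [rng.standard_normal(4) for _ in range(2)]
score = tracin_influence(train_grads, test_grads, lrs=[0.1, 0.05])
```

The cost of this score is dominated by the gradient dimension, which is why computing it with a multi-billion-parameter target model is impractical and a smaller proxy with matching influence rankings is attractive.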

Computer Science > Machine Learning
arXiv:2602.17835 (cs) · Submitted on 19 Feb 2026

Title: Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning
Authors: Sirui Chen, Yunzhe Qi, Mengting Ai, Yifan Sun, Ruizhong Qiu, Jiaru Zou, Jingrui He

Abstract: Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions leverage influence to identify useful samples, but their computational cost scales poorly, making them impractical for multi-billion-parameter large language models (LLMs). A common alternative is to use off-the-shelf smaller models as proxies, but they remain suboptimal: their learning dynamics are unclear, their sizes cannot be flexibly adjusted, and they cannot be further aligned with the target model in terms of gradient-based influence estimation. To address these challenges, we introduce Iprox, a two-stage framework that derives influence-preserving proxies directly from the target model. It first applies a low-rank compression stage to preserve influence information of the target model, and then an aligning stage to align both model gradients and logits, thereby constructing proxies that flexibly control computational c...
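The abstract describes a low-rank compression stage that shrinks the target model while preserving its influence information. The abstract does not specify the exact factorization, so the sketch below uses a generic truncated SVD of a single weight matrix to illustrate the idea: replace an m×n matrix with two rank-r factors, shrinking both storage and gradient dimension. All names and shapes here are assumptions for illustration only.

```python
import numpy as np

def low_rank_proxy(W, rank):
    # Truncated SVD: W ~= U_r @ diag(S_r) @ Vt_r.  Storing the two
    # factors costs rank*(m+n) parameters instead of m*n, and the
    # proxy's per-layer gradients live in that smaller space.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * S[:rank], Vt[:rank]

# Compress a toy 64x32 "weight matrix" to rank 4.
W = np.random.default_rng(1).standard_normal((64, 32))
A, B = low_rank_proxy(W, rank=4)
W_hat = A @ B  # low-rank approximation of W
```

In Iprox, this compression is followed by an aligning stage that matches both gradients and logits between proxy and target; a plain SVD like this preserves weights in a least-squares sense but would not by itself guarantee aligned influence estimates.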
