[2509.21013] Predicting LLM Reasoning Performance with Small Proxy Model

arXiv - Machine Learning · 4 min read

Summary

This article presents rBridge, a method that uses small proxy models (≤1B parameters) to predict the reasoning performance of large language models (LLMs), delivering significant cost and efficiency gains when ranking pre-training datasets.

Why It Matters

As demand for large language models grows, optimizing their pre-training becomes crucial. This study shows how smaller proxy models can reliably predict large-model reasoning performance, potentially reducing costs and improving accessibility in AI research and applications.

Key Takeaways

  • rBridge predicts large-model reasoning performance using small (≤1B-parameter) proxy models.
  • The method cuts dataset ranking costs by over 100x relative to the best existing baseline.
  • It achieves the strongest correlation with large-model results across six reasoning benchmarks at the 1B to 32B scale.
  • Predictive relationships transfer zero-shot across different pre-training datasets at the 1B to 7B scale (see the sketch after this list).
  • The approach offers a practical, cost-effective path to reasoning-oriented pre-training.
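
The zero-shot transfer claim means that a proxy-to-target predictive fit learned on one pre-training corpus can be reused, unchanged, on another, without new large-model runs. A minimal sketch of that idea; the numbers and the linear fit form are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Illustrative numbers only (not from the paper): proxy scores and measured
# large-model accuracy for candidate datasets drawn from corpus A.
proxy_score_A = np.array([2.8, 2.5, 2.2, 1.9, 1.7])    # lower = better fit to gold traces
large_acc_A = np.array([0.31, 0.38, 0.46, 0.55, 0.61])  # large-model benchmark accuracy

# Fit the predictive relationship on corpus A; a linear fit is a stand-in
# for whatever functional form the paper actually uses.
slope, intercept = np.polyfit(proxy_score_A, large_acc_A, deg=1)

# Zero-shot transfer: apply the same fitted map to proxy scores computed
# on candidate datasets from a different pre-training corpus B.
proxy_score_B = np.array([2.6, 2.0])
predicted_acc_B = slope * proxy_score_B + intercept
print(predicted_acc_B)  # predicted large-model accuracy, no new large runs needed
```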

Computer Science > Machine Learning

arXiv:2509.21013 (cs) [Submitted on 25 Sep 2025 (v1), last revised 26 Feb 2026 (this version, v3)]

Title: Predicting LLM Reasoning Performance with Small Proxy Model

Authors: Woosung Koh, Juyoung Suk, Sungjun Han, Se-Young Yun, Jamin Shin

Abstract: Given the prohibitive cost of pre-training large language models, it is essential to leverage smaller proxy models to optimize datasets before scaling up. However, this approach becomes challenging for reasoning capabilities, which exhibit emergent behavior that appears reliably only at larger model sizes, often exceeding 7B parameters. To address this, we introduce rBridge, showing that small proxies ($\leq$1B) can effectively predict large-model reasoning by aligning more closely with (1) the pre-training objective and (2) the target task. rBridge achieves this by weighting negative log-likelihood with task alignment, using reasoning traces from frontier models as gold labels. In our experiments, rBridge (i) reduces dataset ranking costs by over 100x relative to the best baseline, (ii) achieves the strongest correlation across six reasoning benchmarks at 1B to 32B scale, and (iii) zero-shot transfers predictive relationships across pre-training datasets at 1B to 7B scale. These findings indicate that rBridge offers a practical path for exploring reasoning-oriented pre-training at lower cost.
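
The abstract's core scoring idea, weighting the proxy's negative log-likelihood on frontier-model reasoning traces by task alignment, can be sketched as follows. This is a minimal illustration assuming a HuggingFace-style causal LM; the model name is a placeholder and the uniform weights stand in for the paper's task-alignment weighting:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder proxy model; any <=1B-parameter causal LM fits the paper's setting.
MODEL_NAME = "EleutherAI/pythia-410m"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME).eval()

def proxy_trace_nll(question: str, gold_trace: str) -> float:
    """Average NLL the proxy assigns to a frontier-model reasoning trace,
    conditioned on the question. Uniform weights are a placeholder for
    rBridge's task-alignment weighting (illustrative, not the paper's code)."""
    prompt_len = tokenizer(question, return_tensors="pt").input_ids.size(1)
    full_ids = tokenizer(question + gold_trace, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Logits at position t predict token t+1, so shift targets by one.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_nll = -log_probs[torch.arange(targets.size(0)), targets]
    # Score only the trace tokens (the boundary at prompt_len is approximate).
    trace_nll = token_nll[prompt_len - 1:]
    weights = torch.ones_like(trace_nll)  # placeholder alignment weights
    return float((weights * trace_nll).sum() / weights.sum())
```

Lower scores would indicate that a candidate pre-training dataset leaves the proxy better aligned with the target reasoning task; per the abstract, it is these proxy scores that are correlated with large-model benchmark performance.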

Related Articles

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last w...

TechCrunch - AI · 3 min

Von Hammerstein’s Ghost: What a Prussian General’s Officer Typology Can Teach Us About AI Misalignment

Greetings all - I've posted mostly in r/claudecode and r/aigamedev a couple of times previously. Working with CC for personal projects re...

Reddit - Artificial Intelligence · 1 min

World models will be the next big thing, bye-bye LLMs

Was at Nvidia's GTC conference recently and honestly, it was one of the most eye-opening events I've attended in a while. There was a lot...

Reddit - Artificial Intelligence · 1 min

we open sourced a tool that auto generates your AI agent context from your actual codebase, just hit 250 stars

hey everyone. been lurking here for a while and wanted to share something we've been building. the problem: ai coding agents are only as goo...

Reddit - Artificial Intelligence · 1 min

