[2602.22617] Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA

arXiv - Machine Learning

Summary

The paper introduces Semantic Tube Prediction (STP), a method that improves the data efficiency of large language models (LLMs) by constraining hidden-state trajectories to a tubular neighborhood of an assumed geodesic, allowing models to match baseline accuracy with substantially less training data.

Why It Matters

This research challenges existing scaling laws in LLMs by demonstrating that geometric priors can lead to improved data efficiency. As data costs rise, finding methods to optimize training processes is crucial for advancing AI capabilities without requiring massive datasets.

Key Takeaways

  • STP allows LLMs to maintain accuracy with 16x less training data.
  • The Geodesic Hypothesis posits that token sequences follow geodesics on a semantic manifold.
  • STP improves signal-to-noise ratio and preserves diversity during inference.
  • The method challenges traditional data-efficiency bounds in LLM training.
  • Code for the proposed method is publicly available for further research.

Computer Science > Machine Learning
arXiv:2602.22617 (cs) [Submitted on 26 Feb 2026]

Title: Semantic Tube Prediction: Beating LLM Data Efficiency with JEPA
Authors: Hai Huang, Yann LeCun, Randall Balestriero

Abstract: Large Language Models (LLMs) obey consistent scaling laws — empirical power-law fits that predict how loss decreases with compute, data, and parameters. While predictive, these laws are descriptive rather than prescriptive: they characterize typical training, not optimal training. Surprisingly few works have successfully challenged the data-efficiency bounds implied by these laws — which is our primary focus. To that end, we introduce the Geodesic Hypothesis, positing that token sequences trace geodesics on a smooth semantic manifold and are therefore locally linear. Building on this principle, we propose a novel Semantic Tube Prediction (STP) task, a JEPA-style regularizer that confines hidden-state trajectories to a tubular neighborhood of the geodesic. STP generalizes JEPA to language without requiring explicit multi-view augmentations. We show this constraint improves signal-to-noise ratio, and consequently preserves diversity by preventing trajectory collisions during inference. Empirically, STP allows LLMs to match baseline accuracy with 16× less training data on the NL-RX-SYNTH dataset, dire...
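To make the "tubular neighborhood" idea concrete, here is a minimal toy sketch (not the paper's implementation) of a tube-style penalty. It assumes the locally linear geodesic can be approximated by the straight chord between the first and last hidden states of a sequence, and applies a hinge penalty to states that stray outside a tube of a chosen radius around that chord. The function name `tube_penalty` and the `radius` parameter are illustrative inventions, not from the paper.

```python
import numpy as np

def tube_penalty(hidden_states, radius=1.0):
    """Toy 'semantic tube' regularizer (illustrative only).

    Penalizes hidden states that stray more than `radius` from the
    straight chord between the first and last states, a stand-in for
    the locally linear geodesic posited by the Geodesic Hypothesis.
    """
    h = np.asarray(hidden_states, dtype=float)  # shape (T, d)
    start, end = h[0], h[-1]
    chord = end - start
    denom = chord @ chord
    if denom == 0.0:
        # Degenerate chord: measure distance to the single point.
        dists = np.linalg.norm(h - start, axis=1)
    else:
        # Project each state onto the chord, clamp to the segment,
        # then measure the orthogonal distance to the nearest point.
        t = np.clip((h - start) @ chord / denom, 0.0, 1.0)
        nearest = start + t[:, None] * chord
        dists = np.linalg.norm(h - nearest, axis=1)
    # Hinge loss: zero inside the tube, quadratic outside.
    return float(np.mean(np.maximum(dists - radius, 0.0) ** 2))
```

In a training loop, a term like this would be added to the usual language-modeling loss with a weighting coefficient; states already inside the tube incur no penalty, so the constraint only acts on trajectories that wander off the chord.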
