[2602.13665] HyFunc: Accelerating LLM-based Function Calls for Agentic AI through Hybrid-Model Cascade and Dynamic Templating

arXiv - AI · 4 min read · Article

Summary

The paper presents HyFunc, a framework designed to enhance the efficiency of LLM-based function calls in agentic AI by reducing computational redundancies through a hybrid-model cascade and dynamic templating.

Why It Matters

As AI systems increasingly rely on large language models (LLMs) for real-time applications, optimizing function calls is crucial. HyFunc addresses significant inefficiencies in existing models, thus improving performance and reducing latency, which is vital for the advancement of agentic AI technologies.

Key Takeaways

  • HyFunc reduces inference latency to 0.828 seconds, outperforming existing models.
  • The framework uses a hybrid-model cascade to streamline function call processing.
  • Dynamic templating minimizes syntactic redundancy in function calls.
  • HyFunc maintains high performance, achieving 80.1% accuracy on benchmark datasets.
  • The code for HyFunc is publicly available, promoting further research and development.
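One of the bullets above credits dynamic templating with minimizing syntactic redundancy. A minimal sketch of the general idea (the function names and slot format below are illustrative assumptions, not HyFunc's actual implementation): the fixed JSON boilerplate of a call is pre-rendered once as a template, so the model only has to produce the variable argument values rather than every brace, quote, and key.

```python
import json

def build_template(func_name, param_names):
    # The boilerplate (function name, keys, punctuation) is fixed ahead of
    # time; only the values marked "<PARAM>" must be produced at inference.
    return {"name": func_name,
            "arguments": {p: "<" + p.upper() + ">" for p in param_names}}

def fill_template(template, values):
    # Substitute model-predicted values into the pre-built skeleton, so no
    # boilerplate tokens are generated autoregressively.
    call = dict(template)
    call["arguments"] = {p: values[p] for p in template["arguments"]}
    return json.dumps(call)

tmpl = build_template("get_weather", ["city", "unit"])
print(fill_template(tmpl, {"city": "Paris", "unit": "celsius"}))
# -> {"name": "get_weather", "arguments": {"city": "Paris", "unit": "celsius"}}
```

In this toy version the "model" would only ever emit the two strings "Paris" and "celsius"; everything else in the serialized call is template, which is where the latency savings come from.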

Computer Science > Artificial Intelligence · arXiv:2602.13665 (cs) · Submitted on 14 Feb 2026

Title: HyFunc: Accelerating LLM-based Function Calls for Agentic AI through Hybrid-Model Cascade and Dynamic Templating

Authors: Weibin Liao, Jian-guang Lou, Haoyi Xiong

Abstract: While agentic AI systems rely on LLMs to translate user intent into structured function calls, this process is fraught with computational redundancy, leading to high inference latency that hinders real-time applications. This paper identifies and addresses three key redundancies: (1) the redundant processing of a large library of function descriptions for every request; (2) the redundant use of a large, slow model to generate an entire, often predictable, token sequence; and (3) the redundant generation of fixed, boilerplate parameter syntax. We introduce HyFunc, a novel framework that systematically eliminates these inefficiencies. HyFunc employs a hybrid-model cascade where a large model distills user intent into a single "soft token." This token guides a lightweight retriever to select relevant functions and directs a smaller, prefix-tuned model to generate the final call, thus avoiding redundant context processing and full-sequence generation by the large model. To eliminate syntactic redundancy, our...
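The cascade described in the abstract can be sketched at a very high level. Everything below is an illustrative assumption — toy hand-written vectors stand in for learned embeddings, and the function names are hypothetical — meant only to show the routing idea: a large model emits a single intent embedding (the "soft token"), and a cheap retriever scores it against a pre-embedded function library so the full library is never re-read per request. (Stage 3, the prefix-tuned small model that writes the call, is omitted here.)

```python
import math

def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hypothetical pre-embedded function library; in a real system these
# embeddings would come from a learned encoder, not hand-written vectors.
FUNCTION_INDEX = [
    {"name": "get_weather", "emb": [0.9, 0.1, 0.0]},
    {"name": "send_email",  "emb": [0.1, 0.9, 0.0]},
    {"name": "set_alarm",   "emb": [0.0, 0.1, 0.9]},
]

def retrieve(soft_token, function_index, k=2):
    # Lightweight retrieval stage: score each function embedding against
    # the single intent embedding and keep only the top-k candidates,
    # so the downstream small model sees a narrow context.
    ranked = sorted(function_index, key=lambda f: -cosine(soft_token, f["emb"]))
    return ranked[:k]

soft_token = [0.8, 0.2, 0.1]  # pretend output of the large model
top = retrieve(soft_token, FUNCTION_INDEX, k=1)
print(top[0]["name"])  # -> get_weather
```

The design point the sketch tries to convey is that the expensive model runs once per request to produce one vector, while per-function work is reduced to a similarity lookup.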
