[2602.13665] HyFunc: Accelerating LLM-based Function Calls for Agentic AI through Hybrid-Model Cascade and Dynamic Templating
Summary
The paper presents HyFunc, a framework designed to enhance the efficiency of LLM-based function calls in agentic AI by reducing computational redundancies through a hybrid-model cascade and dynamic templating.
Why It Matters
As AI systems increasingly rely on large language models (LLMs) for real-time applications, optimizing function calls is crucial. HyFunc targets three specific inefficiencies in existing pipelines: reprocessing the full function library on every request, generating entire token sequences with a large, slow model, and regenerating fixed boilerplate parameter syntax. Eliminating these reduces latency, which is vital for the advancement of agentic AI technologies.
Key Takeaways
- HyFunc reduces inference latency to 0.828 seconds, outperforming existing models.
- The framework uses a hybrid-model cascade to streamline function call processing.
- Dynamic templating minimizes syntactic redundancy in function calls.
- HyFunc maintains high performance, achieving 80.1% accuracy on benchmark datasets.
- The code for HyFunc is publicly available, promoting further research and development.
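The dynamic-templating idea above can be illustrated with a minimal sketch: the fixed syntax of a function call lives in a template, so a model only needs to produce the variable slot values rather than the full token sequence. The function names and template format here are illustrative assumptions, not HyFunc's actual API.

```python
def render_call(template: str, slots: dict[str, str]) -> str:
    """Fill a function-call template with model-generated slot values.

    Only the slot values need to be generated; the surrounding syntax
    (function name, parentheses, parameter names, quotes) is fixed.
    """
    call = template
    for name, value in slots.items():
        call = call.replace("{" + name + "}", value)
    return call

# Hypothetical template: the boilerplate 'get_weather(city=..., unit=...)'
# is never generated token by token; only two short values are.
template = 'get_weather(city="{city}", unit="{unit}")'
print(render_call(template, {"city": "Paris", "unit": "celsius"}))
# → get_weather(city="Paris", unit="celsius")
```

The point of the sketch is the cost asymmetry: the model generates two short strings instead of the whole call, which is where the syntactic-redundancy savings come from.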
Computer Science > Artificial Intelligence
arXiv:2602.13665 (cs) [Submitted on 14 Feb 2026]
Title: HyFunc: Accelerating LLM-based Function Calls for Agentic AI through Hybrid-Model Cascade and Dynamic Templating
Authors: Weibin Liao, Jian-guang Lou, Haoyi Xiong
Abstract: While agentic AI systems rely on LLMs to translate user intent into structured function calls, this process is fraught with computational redundancy, leading to high inference latency that hinders real-time applications. This paper identifies and addresses three key redundancies: (1) the redundant processing of a large library of function descriptions for every request; (2) the redundant use of a large, slow model to generate an entire, often predictable, token sequence; and (3) the redundant generation of fixed, boilerplate parameter syntax. We introduce HyFunc, a novel framework that systematically eliminates these inefficiencies. HyFunc employs a hybrid-model cascade where a large model distills user intent into a single "soft token." This token guides a lightweight retriever to select relevant functions and directs a smaller, prefix-tuned model to generate the final call, thus avoiding redundant context processing and full-sequence generation by the large model. To eliminate syntactic redundancy, our...
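The cascade described in the abstract can be sketched as a toy pipeline: a single intent vector (standing in for the "soft token" produced by the large model) scores the function library so that only relevant descriptions reach the small generator, instead of prepending every description to every request. The embeddings, function names, and scoring scheme below are illustrative assumptions, not HyFunc's trained components.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Stand-in embeddings for three function descriptions in the library.
library = {
    "get_weather": [0.9, 0.1, 0.0],
    "send_email":  [0.1, 0.9, 0.0],
    "book_flight": [0.0, 0.2, 0.9],
}

def retrieve(soft_token: list[float], library: dict, k: int = 1) -> list[str]:
    """Rank functions by similarity to the intent vector; keep the top k.

    Only these k descriptions would be passed to the small, prefix-tuned
    generator, avoiding reprocessing the whole library per request.
    """
    ranked = sorted(library, key=lambda name: cosine(soft_token, library[name]),
                    reverse=True)
    return ranked[:k]

# Pretend the large model distilled "what's the weather in Paris?" into this.
soft_token = [0.85, 0.15, 0.05]
print(retrieve(soft_token, library))
# → ['get_weather']
```

In the real system the retriever and generator are learned models conditioned on the soft token; this sketch only shows the control flow that lets the large model run once per request while cheaper components do the rest.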