Measuring Open-Source Llama Nemotron Models on DeepResearch Bench
About this article
A blog post by NVIDIA on Hugging Face
Published August 4, 2025
Jay Rodge (jayrodge), NVIDIA
Contributors: David Austin, Raja Biswas, Gilberto Titericz Junior, NVIDIA

NVIDIA's AI-Q Blueprint, the leading portable, open deep research agent, recently climbed to the top of the Hugging Face "LLM with Search" leaderboard on DeepResearch Bench. This is a significant step forward for the open-source AI stack, proving that developer-accessible models can power advanced agentic workflows that rival or surpass closed alternatives.

What sets AI-Q apart? It fuses two high-performance open LLMs, Llama 3.3-70B Instruct and Llama-3.3-Nemotron-Super-49B-v1.5, to orchestrate long-context retrieval, agentic reasoning, and robust synthesis.

Core Stack: Model Choices and Technical Innovations

- Llama 3.3-70B Instruct: the foundation for fluent, structured report generation, derived from Meta's Llama series and open-licensed for unrestricted deployment.
- Llama-3.3-Nemotron-Super-49B-v1.5: an optimized, reasoning-focused variant. Built via Neural Architecture Search (NAS), knowledge distillation, and successive rounds of supervised and reinforcement learning, it excels at multi-step reasoning, query planning, tool use, and reflection, all with a reduced memory footprint for efficient deployment on standard GPUs.

The AI-Q reference example also includes:

- NVIDIA NeMo Retriever for scalable, multimodal search (internal+external)...
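The two-model division of labor described above (a reasoning model for query planning, an instruct model for report synthesis, with a retrieval stage in between) can be sketched as a minimal orchestration loop. This is a hypothetical illustration, not the AI-Q Blueprint's actual API: the model names come from the article, but `call_model` and `retrieve` are placeholder stubs standing in for real LLM and NeMo Retriever calls.

```python
# Hypothetical sketch of the two-model deep-research pattern.
# Model names are from the article; the call interfaces are placeholders.

REASONING_MODEL = "llama-3.3-nemotron-super-49b-v1.5"  # planning, reflection
WRITER_MODEL = "llama-3.3-70b-instruct"                # report synthesis

def call_model(model: str, prompt: str) -> str:
    """Placeholder for an LLM call (e.g., an OpenAI-compatible endpoint)."""
    if model == REASONING_MODEL:
        # Stubbed plan: a semicolon-separated list of search steps.
        return "search the web; search internal docs"
    return "Report: " + prompt.splitlines()[0]

def retrieve(query: str) -> str:
    """Placeholder for the retrieval stage (NeMo Retriever in AI-Q)."""
    return f"[evidence for '{query}']"

def deep_research(question: str) -> str:
    # 1. The reasoning model decomposes the question into search queries.
    plan = call_model(REASONING_MODEL, f"Plan searches for: {question}")
    queries = [step.strip() for step in plan.split(";")]

    # 2. Gather evidence for each planned query.
    evidence = [retrieve(q) for q in queries]

    # 3. The instruct model synthesizes the final structured report.
    return call_model(WRITER_MODEL, "\n".join([question] + evidence))

report = deep_research("How do open models score on DeepResearch Bench?")
```

In a real deployment both `call_model` stubs would be network calls to hosted model endpoints, and the loop would typically add a reflection step where the reasoning model reviews the draft and requests further searches.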