[D] Tested model routing on financial AI datasets — good savings and curious what benchmarks others use.
Ran a benchmark evaluating whether prompt complexity-based routing delivers meaningful savings, using public HuggingFace datasets. Here's what I found.

**Setup**

Baseline: Claude Opus for everything. Tested two strategies:

- **Intra-provider** — routes within the same provider by complexity: Simple → Haiku, Medium → Sonnet, Complex → Opus
- **Flexible** — medium prompts go to self-hosted Qwen 3.5 27B / Gemma 3 27B. Complex always stays on Opus

**Datasets used**

All from AdaptLLM/finance-tasks on HuggingFace: FiQA-SA ...
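For anyone curious what the routing logic looks like in practice, here's a minimal sketch of the two strategies. The complexity classifier (a word-count heuristic) and the model names in the routing tables are my own illustrative assumptions, not the actual classifier or identifiers used in the benchmark:

```python
def classify_complexity(prompt: str) -> str:
    """Toy heuristic: bucket a prompt by length (stand-in for a real classifier)."""
    n = len(prompt.split())
    if n < 20:
        return "simple"
    if n < 100:
        return "medium"
    return "complex"

# Strategy 1: intra-provider — stay within one provider's tiers.
INTRA_PROVIDER = {
    "simple": "claude-haiku",
    "medium": "claude-sonnet",
    "complex": "claude-opus",
}

# Strategy 2: flexible — medium goes to a self-hosted model;
# complex always stays on Opus, as in the post.
FLEXIBLE = {
    "simple": "claude-haiku",
    "medium": "qwen-3.5-27b",  # or gemma-3-27b
    "complex": "claude-opus",
}

def route(prompt: str, table: dict) -> str:
    """Pick a model for a prompt under a given routing table."""
    return table[classify_complexity(prompt)]
```

The key design point is that only the routing table changes between strategies; the classifier is shared, so you can compare cost/quality across strategies on the same prompt buckets.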