Things I got wrong building a confidence evaluator for local LLMs [D]
I've been building **Autodidact**, a local-first AI agent framework. The central piece is a **confidence evaluator** - something that decides whether a small local model (Qwen 2.5 7B, Llama 3.1 8B, Mistral 7B) can answer a question on its own, or whether the question should be escalated to a cloud model.

Autodidact is still in development. I'll open-source the repo once v0.1 is stable enough for external eyes - until then, this post is the current state of the experiments. If the confidence evaluator works, you ge...
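To make the routing decision concrete, here's a minimal sketch of what an evaluator-based router could look like. This is not Autodidact's actual implementation - the `route`, `Verdict`, and `toy_score` names, the threshold value, and the scoring heuristic are all illustrative assumptions; the real confidence signal would come from the evaluator itself (e.g. token logprobs or a self-critique pass).

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    confidence: float  # hypothetical evaluator score in [0, 1]
    escalate: bool     # True -> send the question to the cloud model

def route(question: str,
          score_fn: Callable[[str], float],
          threshold: float = 0.7) -> Verdict:
    """Trust the local model if confidence clears the threshold,
    otherwise escalate. score_fn stands in for whatever signal
    the confidence evaluator produces."""
    confidence = score_fn(question)
    return Verdict(confidence=confidence, escalate=confidence < threshold)

# Toy scorer for illustration only: pretend short questions are "easy".
def toy_score(question: str) -> float:
    return 0.9 if len(question.split()) < 12 else 0.4

print(route("What is the capital of France?", toy_score))
```

The interesting part is everything hidden inside `score_fn` - a fixed threshold over a single scalar is the simplest possible policy, and the rest of this post is about where that scalar goes wrong.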