Things I got wrong building a confidence evaluator for local LLMs [D]
I've been building **Autodidact**, a local-first AI agent framework. The central piece is a **confidence evaluator** - something that dec...