Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

Things I got wrong building a confidence evaluator for local LLMs [D]

I've been building **Autodidact**, a local-first AI agent framework. The central piece is a **confidence evaluator** - something that dec...

Reddit - Machine Learning · 1 min · 17 minutes ago

LLMs

I’m convinced 90% of you building "AI Agents" are just burning money on proxy providers. [D]

Seriously, I just audited my stack and realized I’m spending more on rotating residential proxies than I am on the actual Claude and Open...

Reddit - Machine Learning · 1 min · 17 minutes ago

Machine Learning

I recently tested Gemma 4-31B locally and I was blown away with the intelligence/size ratio of this model. These papers show how they achieved such distillation capabilities. [R]

The secret sauce here is that the student model does not just try to guess the next token in a sentence, which is how most AI is trained....

Reddit - Machine Learning · 1 min · about 2 hours ago

All Content

LLMs

[2505.08548] From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

Abstract page for arXiv paper 2505.08548: From Seeing to Doing: Bridging Reasoning and Decision for Robotic Manipulation

arXiv - AI · 4 min · 20 days ago

Machine Learning

[2505.05375] Threshold Modulation for Online Test-Time Adaptation of Spiking Neural Networks

Abstract page for arXiv paper 2505.05375: Threshold Modulation for Online Test-Time Adaptation of Spiking Neural Networks

arXiv - AI · 4 min · 20 days ago

Machine Learning

[2503.08751] Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning

Abstract page for arXiv paper 2503.08751: Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for ...

arXiv - Machine Learning · 4 min · 20 days ago

Machine Learning

[2502.06096] Post-detection inference for sequential changepoint localization

Abstract page for arXiv paper 2502.06096: Post-detection inference for sequential changepoint localization

arXiv - AI · 3 min · 20 days ago

Machine Learning

[2412.07469] Score-matching-based Structure Learning for Temporal Data on Networks

Abstract page for arXiv paper 2412.07469: Score-matching-based Structure Learning for Temporal Data on Networks

arXiv - Machine Learning · 4 min · 20 days ago

Machine Learning

[2412.11308] From XAI to MLOps: Explainable Concept Drift Detection with Profile Drift Detection

Abstract page for arXiv paper 2412.11308: From XAI to MLOps: Explainable Concept Drift Detection with Profile Drift Detection

arXiv - Machine Learning · 4 min · 20 days ago

Machine Learning

[2411.02225] Sparse Max-Affine Regression

Abstract page for arXiv paper 2411.02225: Sparse Max-Affine Regression

arXiv - Machine Learning · 4 min · 20 days ago

LLMs

[2410.14826] SPRIG: Improving Large Language Model Performance by System Prompt Optimization

Abstract page for arXiv paper 2410.14826: SPRIG: Improving Large Language Model Performance by System Prompt Optimization

arXiv - AI · 4 min · 20 days ago

Machine Learning

[2403.12072] Floralens: a Deep Learning Model for the Portuguese Native Flora

Abstract page for arXiv paper 2403.12072: Floralens: a Deep Learning Model for the Portuguese Native Flora

arXiv - Machine Learning · 4 min · 20 days ago

Machine Learning

[2302.08724] Piecewise Deterministic Markov Processes for Bayesian Neural Networks

Abstract page for arXiv paper 2302.08724: Piecewise Deterministic Markov Processes for Bayesian Neural Networks

arXiv - Machine Learning · 3 min · 20 days ago

Machine Learning

[2302.00797] Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinforcement Learning

Abstract page for arXiv paper 2302.00797: Combining Tree-Search, Generative Models, and Nash Bargaining Concepts in Game-Theoretic Reinfo...

arXiv - AI · 4 min · 20 days ago

Machine Learning

[2006.12024] Bayesian Neural Networks: An Introduction and Survey

Abstract page for arXiv paper 2006.12024: Bayesian Neural Networks: An Introduction and Survey

arXiv - Machine Learning · 3 min · 20 days ago

Machine Learning

[2603.29086] Realistic Market Impact Modeling for Reinforcement Learning Trading Environments

Abstract page for arXiv paper 2603.29086: Realistic Market Impact Modeling for Reinforcement Learning Trading Environments

arXiv - Machine Learning · 4 min · 20 days ago

Machine Learning

[2603.28942] ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Abstract page for arXiv paper 2603.28942: ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference At...

arXiv - Machine Learning · 4 min · 20 days ago

LLMs

[2603.13285] Brittlebench: Quantifying LLM robustness via prompt sensitivity

Abstract page for arXiv paper 2603.13285: Brittlebench: Quantifying LLM robustness via prompt sensitivity

arXiv - AI · 4 min · 20 days ago

Machine Learning

[2603.11321] Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings

Abstract page for arXiv paper 2603.11321: Hindsight-Anchored Policy Optimization: Turning Failure into Feedback in Sparse Reward Settings

arXiv - AI · 3 min · 20 days ago

Machine Learning

[2603.10742] A Grammar of Machine Learning Workflows

Abstract page for arXiv paper 2603.10742: A Grammar of Machine Learning Workflows

arXiv - Machine Learning · 3 min · 20 days ago

Machine Learning

[2603.06977] NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning

Abstract page for arXiv paper 2603.06977: NePPO: Near-Potential Policy Optimization for General-Sum Multi-Agent Reinforcement Learning

arXiv - AI · 4 min · 20 days ago

LLMs

[2602.04448] RASA: Routing-Aware Safety Alignment for Mixture-of-Experts Models

Abstract page for arXiv paper 2602.04448: RASA: Routing-Aware Safety Alignment for Mixture-of-Experts Models

arXiv - AI · 3 min · 20 days ago

LLMs

[2602.01554] InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenization in Unified MLLMs

Abstract page for arXiv paper 2602.01554: InfoTok: Information-Theoretic Regularization for Capacity-Constrained Shared Visual Tokenizati...

arXiv - AI · 4 min · 20 days ago
