[2510.06649] Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions
Computer Science > Machine Learning
arXiv:2510.06649 (cs)
[Submitted on 8 Oct 2025 (v1), last revised 3 Apr 2026 (this version, v2)]

Title: Local Reinforcement Learning with Action-Conditioned Root Mean Squared Q-Functions
Authors: Frank Wu, Mengye Ren

Abstract: The Forward-Forward (FF) algorithm is a recently proposed learning procedure for neural networks that employs two forward passes instead of the forward and backward passes used in backpropagation. However, FF remains largely confined to supervised settings, leaving a gap in domains where learning signals arise more naturally, such as reinforcement learning (RL). In this work, inspired by FF's goodness function based on layer activity statistics, we introduce Action-conditioned Root mean squared Q-functions (ARQ), a novel value estimation method that applies a goodness function and action conditioning to local RL with temporal difference learning. Despite its simplicity and biological grounding, our approach outperforms state-of-the-art local backprop-free RL methods on the MinAtar and DeepMind Control Suite benchmarks, and also surpasses algorithms trained with backpropagation on most tasks. Code can be found at this https URL.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2510.06649 [cs.LG]
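To make the abstract's idea concrete, the sketch below shows one way a goodness-style, action-conditioned value estimate with a local TD update might look. This is an illustration only, not the paper's implementation: the dimensions, learning rate, single linear layer, and greedy bootstrap target are all assumptions, and the only elements taken from the abstract are (a) the Q-value read out as the root mean squared activity of a layer, (b) conditioning that layer on the action, and (c) a temporal-difference update that stays local to the layer (no backpropagation through other layers).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and hyperparameters -- not taken from the paper.
obs_dim, hidden_dim, n_actions = 4, 32, 3
gamma = 0.99   # discount factor
lr = 1e-2      # local learning rate

# One set of layer weights per action: the action selects which weights
# produce the activity whose RMS is read out as the Q-value.
W = rng.normal(scale=0.1, size=(n_actions, hidden_dim, obs_dim))

def q_value(obs, action):
    """Goodness-style Q-value: root mean squared activity of the layer
    conditioned on the chosen action."""
    h = W[action] @ obs
    return float(np.sqrt(np.mean(h ** 2))), h

def local_td_update(obs, action, reward, next_obs, done):
    """One TD(0) step that is local to this layer: the gradient of the
    RMS readout w.r.t. this layer's own weights is computed in closed
    form, so no error signal is propagated through other layers."""
    q, h = q_value(obs, action)
    next_q = 0.0 if done else max(q_value(next_obs, a)[0]
                                  for a in range(n_actions))
    target = reward + gamma * next_q
    # d(RMS)/dW_ij = h_i * obs_j / (hidden_dim * q), derived from
    # q = sqrt(mean(h^2)) with h = W @ obs.
    grad = (h[:, None] * obs[None, :]) / (h.size * max(q, 1e-8))
    W[action] += lr * (target - q) * grad
    return q, target
```

With a fixed terminal transition (`done=True`, reward 1.0), repeated calls to `local_td_update` drive the RMS activity of the selected action's layer toward the reward, which is the behavior a TD-trained goodness readout should exhibit.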