[2604.04808] Selecting Decision-Relevant Concepts in Reinforcement Learning
Computer Science > Machine Learning
arXiv:2604.04808 (cs)
[Submitted on 6 Apr 2026]
Title: Selecting Decision-Relevant Concepts in Reinforcement Learning
Authors: Naveen Raman, Stephanie Milani, Fei Fang

Abstract: Training interpretable concept-based policies requires practitioners to manually select which human-understandable concepts an agent should reason with when making sequential decisions. This selection demands domain expertise, is time-consuming and costly, scales poorly with the number of candidates, and provides no performance guarantees. To overcome this limitation, we propose the first algorithms for principled automatic concept selection in sequential decision-making. Our key insight is that concept selection can be viewed through the lens of state abstraction: intuitively, a concept is decision-relevant if removing it would cause the agent to confuse states that require different actions. As a result, agents should rely on decision-relevant concepts; states with the same concept representation should share the same optimal action, which preserves the optimal decision structure of the original state space. This perspective leads to the Decision-Relevant Selection (DRS) algorithm, which selects a subset of concepts from a candidate set, along with performance bounds relating the selected concepts to the pe...
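
The state-abstraction intuition in the abstract can be illustrated with a small toy check. The sketch below is not the DRS algorithm from the paper; it is a brute-force illustration that assumes the optimal action of every state is already known. The concept names (`key_held`, `door_near`, `wall_color`), the toy policy, and the helper functions `conflicts` and `decision_relevant` are all hypothetical. A concept is flagged as decision-relevant if dropping it makes two states with different optimal actions share the same concept representation.

```python
from itertools import combinations, product

# Hypothetical binary concepts for a toy key-and-door task (assumption).
CONCEPTS = ["key_held", "door_near", "wall_color"]


def toy_optimal_action(state):
    """Assumed ground-truth optimal action; wall_color is irrelevant by design."""
    if not state["key_held"]:
        return "search"
    return "open" if state["door_near"] else "move"


def conflicts(states, actions, concepts):
    """Count state pairs that agree on `concepts` but need different actions."""
    groups = {}
    for state, action in zip(states, actions):
        key = tuple(state[c] for c in concepts)
        groups.setdefault(key, []).append(action)
    return sum(
        1
        for grouped in groups.values()
        for a1, a2 in combinations(grouped, 2)
        if a1 != a2
    )


def decision_relevant(states, actions, concepts):
    """Return the concepts whose removal creates new action conflicts."""
    baseline = conflicts(states, actions, concepts)
    relevant = []
    for concept in concepts:
        remaining = [c for c in concepts if c != concept]
        if conflicts(states, actions, remaining) > baseline:
            relevant.append(concept)
    return relevant


# Enumerate all 8 states over the three binary concepts.
states = [dict(zip(CONCEPTS, values)) for values in product([0, 1], repeat=3)]
actions = [toy_optimal_action(s) for s in states]

print(decision_relevant(states, actions, CONCEPTS))
```

In this toy setting, removing `wall_color` never merges states with different optimal actions, so it is not flagged, while removing `key_held` or `door_near` does. A practical method would not have the optimal actions in hand and could not enumerate every state, which is the gap a principled selection algorithm has to close.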