[2509.22626] Learning Admissible Heuristics for A*: Theory and Practice
Summary
This paper studies learning admissible heuristics for the A* search algorithm, introducing a loss function that enforces admissibility during training and improves performance in domains such as the Rubik's Cube.
Why It Matters
The study addresses limitations of current heuristic-learning methods: admissibility is what allows A* to guarantee optimal solutions, yet recent deep learning approaches often disregard it and offer weak generalization guarantees. By proposing a training objective that enforces admissibility, the work advances both the practice and the theory of learned heuristics for search.
Key Takeaways
- Introduces Cross-Entropy Admissibility (CEA) as a loss function for training heuristics.
- Demonstrates significantly stronger guidance than compressed pattern database (PDB) heuristics on the Rubik's Cube.
- Establishes theoretical bounds on sample complexity for learning heuristics in A*.
- Provides generalization guarantees for goal-dependent heuristics using neural networks.
- Highlights the relevance of admissibility in ensuring optimal solutions in search algorithms.
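The paper's exact CEA formulation is not reproduced in this summary, but the core idea of folding the admissibility constraint into the training objective can be sketched with a simpler penalized loss. The function below is an illustrative assumption, not the authors' method: it combines a regression term with a hinge-style penalty that fires only when the predicted heuristic overestimates the true cost (the admissibility violation).

```python
import numpy as np

def admissibility_penalized_loss(h_pred, h_true, lam=10.0):
    """Illustrative sketch (NOT the paper's CEA loss): squared error
    plus a one-sided penalty on overestimation, since h_pred > h_true
    is exactly what breaks admissibility."""
    mse = np.mean((h_pred - h_true) ** 2)
    overestimate = np.maximum(h_pred - h_true, 0.0)  # violations only
    penalty = np.mean(overestimate ** 2)
    return mse + lam * penalty

# Same absolute error, but overestimating costs far more than
# underestimating, pushing the learned heuristic toward admissibility.
h_true = np.array([4.0, 6.0, 8.0])
low = admissibility_penalized_loss(np.array([3.0, 5.0, 7.0]), h_true)   # 1.0
high = admissibility_penalized_loss(np.array([5.0, 7.0, 9.0]), h_true)  # 11.0
```

The asymmetry is the key design choice: an underestimating heuristic is merely less informative, while an overestimating one can cost A* its optimality guarantee.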
Computer Science > Machine Learning
arXiv:2509.22626 (cs)
[Submitted on 26 Sep 2025 (v1), last revised 17 Feb 2026 (this version, v2)]
Title: Learning Admissible Heuristics for A*: Theory and Practice
Authors: Ehsan Futuhi, Nathan R. Sturtevant
Abstract: Heuristic functions are central to the performance of search algorithms such as A*, where admissibility - the property of never overestimating the true shortest-path cost - guarantees solution optimality. Recent deep learning approaches often disregard admissibility and provide limited guarantees on generalization beyond the training data. This paper addresses both of these limitations. First, we pose heuristic learning as a constrained optimization problem and introduce Cross-Entropy Admissibility (CEA), a loss function that enforces admissibility during training. On the Rubik's Cube domain, this method yields near-admissible heuristics with significantly stronger guidance than compressed pattern database (PDB) heuristics. Theoretically, we study the sample complexity of learning heuristics. By leveraging PDB abstractions and the structural properties of graphs such as the Rubik's Cube, we tighten the bound on the number of training samples needed for A* to generalize. Replacing a general hypothesis class with a ReLU neural network gives bounds that depend primari...
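The abstract's central property - that an admissible heuristic never overestimates the remaining cost, so A* returns an optimal solution - can be demonstrated with a textbook A* on a toy graph. The graph and heuristic values below are hypothetical, chosen only to illustrate the guarantee:

```python
import heapq

def a_star(graph, h, start, goal):
    """Textbook A*. If h never overestimates the true cost-to-go
    (admissibility), the returned cost is provably optimal."""
    open_heap = [(h[start], 0.0, start)]  # (f = g + h, g, node)
    best_g = {start: 0.0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue  # stale entry
        for nbr, w in graph[node]:
            ng = g + w
            if ng < best_g.get(nbr, float("inf")):
                best_g[nbr] = ng
                heapq.heappush(open_heap, (ng + h[nbr], ng, nbr))
    return float("inf")

# Toy graph (hypothetical): the detour S->A->G costs 1+1=2,
# the direct edge S->G costs 4.
graph = {"S": [("A", 1.0), ("G", 4.0)], "A": [("G", 1.0)], "G": []}
h = {"S": 1.5, "A": 1.0, "G": 0.0}  # admissible: h <= true cost-to-go
cost = a_star(graph, h, "S", "G")   # optimal cost 2.0, not 4.0
```

With an inadmissible heuristic (e.g. h["A"] = 5.0 here), A* could expand the goal via the direct edge first and return the suboptimal cost 4.0, which is precisely why the paper insists on enforcing admissibility during training.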