[2602.14578] RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch
Summary
The paper presents RNM-TD3, an off-policy reinforcement learning method that enforces N:M semi-structured sparsity on all of its networks throughout training, improving performance at high sparsity levels while remaining compatible with hardware acceleration.
Why It Matters
This research addresses the challenge of balancing model compression and performance in deep reinforcement learning. By introducing N:M structured sparsity, it opens new avenues for efficient training and deployment of RL models, which is crucial for real-time applications and resource-constrained environments.
Key Takeaways
- RNM-TD3 outperforms its dense counterpart at 50%-75% sparsity (e.g., 2:4).
- The method maintains compatibility with hardware that supports N:M sparse operations.
- Structured sparsity can lead to faster training times without sacrificing model accuracy.
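To make the N:M pattern concrete: in row-wise 2:4 sparsity, every consecutive group of four weights in a row keeps only its two largest-magnitude entries, yielding 50% sparsity in a regular pattern that accelerators can exploit. The paper does not release this exact code; the sketch below is a minimal NumPy illustration of magnitude-based N:M pruning, with all names chosen for illustration.

```python
import numpy as np

def nm_sparsify(weights, n=2, m=4):
    """Zero all but the n largest-magnitude entries in each consecutive
    group of m weights, row by row (n=2, m=4 gives 2:4, i.e. 50% sparsity)."""
    w = np.asarray(weights, dtype=float)
    rows, cols = w.shape
    assert cols % m == 0, "row length must be divisible by m"
    groups = w.reshape(rows, cols // m, m)
    # Rank entries within each group by magnitude; keep the top n.
    order = np.argsort(-np.abs(groups), axis=-1)
    mask = np.zeros_like(groups)
    np.put_along_axis(mask, order[..., :n], 1.0, axis=-1)
    return (groups * mask).reshape(rows, cols)

w = np.array([[0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.03, 0.6]])
print(nm_sparsify(w))
# -> [[ 0.9  0.   0.4  0.  -0.7  0.   0.   0.6]]
```

Each group of four retains exactly two nonzeros, which is the structural invariant that N:M-capable hardware relies on.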
Computer Science > Machine Learning — arXiv:2602.14578 (cs)
[Submitted on 16 Feb 2026]
Title: RNM-TD3: N:M Semi-structured Sparse Reinforcement Learning From Scratch
Authors: Isam Vrce, Andreas Kassler, Gökçe Aydos
Abstract: Sparsity is a well-studied technique for compressing deep neural networks (DNNs) without compromising performance. In deep reinforcement learning (DRL), neural networks with as few as 5% of their original weights can still be trained with minimal performance loss compared to their dense counterparts. However, most existing methods rely on unstructured fine-grained sparsity, which limits hardware acceleration opportunities due to irregular computation patterns. Structured coarse-grained sparsity enables hardware acceleration, yet typically degrades performance and increases pruning complexity. In this work, we present, to the best of our knowledge, the first study on N:M structured sparsity in RL, which balances compression, performance, and hardware efficiency. Our framework enforces row-wise N:M sparsity throughout training for all networks in off-policy RL (TD3), maintaining compatibility with accelerators that support N:M sparse matrix operations. Experiments on continuous-control benchmarks show that RNM-TD3, our N:M sparse agent, outperforms its dense counterpart at 50%-75% sparsity (e.g., 2:4...
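The abstract says the N:M constraint is enforced throughout training, not applied as post-hoc pruning. One common way to realize this (a hedged sketch, not the authors' implementation; the learning rate, shapes, and random gradients below are placeholders) is to re-project the weights onto the row-wise N:M sparse set after every optimizer step:

```python
import numpy as np

def project_nm(w, n=2, m=4):
    """Project a weight matrix onto the row-wise n:m sparse set by
    keeping the n largest-magnitude entries per group of m."""
    rows, cols = w.shape
    g = w.reshape(rows, cols // m, m)
    idx = np.argsort(-np.abs(g), axis=-1)[..., :n]
    mask = np.zeros_like(g)
    np.put_along_axis(mask, idx, 1.0, axis=-1)
    return (g * mask).reshape(rows, cols)

# Illustrative training loop: the gradient here is random noise standing in
# for a real TD3 actor/critic gradient; only the projection step matters.
rng = np.random.default_rng(0)
w = project_nm(rng.standard_normal((4, 8)))
for _ in range(100):
    grad = rng.standard_normal(w.shape)
    w = project_nm(w - 1e-2 * grad)  # update, then restore 2:4 sparsity

# Invariant: every row still has exactly 2 nonzeros per group of 4.
assert (np.count_nonzero(w.reshape(4, 2, 4), axis=-1) == 2).all()
```

Because the pattern is re-established after each update, the networks are N:M sparse at every point in training, which is what keeps them compatible with N:M sparse matrix hardware from scratch.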