[2602.17098] Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization
Summary
This article presents a comparative study of Deep Reinforcement Learning (DRL) and Mean-Variance Optimization (MVO) for optimal portfolio allocation, highlighting the strengths and practical applications of DRL in finance.
Why It Matters
As financial markets evolve, traditional methods like MVO may not suffice for optimal portfolio management. This study provides insights into how DRL can enhance investment strategies, potentially leading to better risk-adjusted returns. Understanding these methods is crucial for finance professionals and AI researchers alike.
Key Takeaways
- In the study's backtests, DRL performs strongly against traditional MVO for portfolio allocation.
- The study emphasizes practical adjustments needed for implementing DRL in finance.
- Backtesting results indicate improved metrics such as Sharpe ratio and absolute returns with DRL.
Quantitative Finance > Portfolio Management
arXiv:2602.17098 (q-fin)
[Submitted on 19 Feb 2026]
Title: Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization
Authors: Srijan Sood, Kassiani Papasotiriou, Marius Vaiciulis, Tucker Balch
Abstract: Portfolio Management is the process of overseeing a group of investments, referred to as a portfolio, with the objective of achieving predetermined investment goals. Portfolio optimization is a key component that involves allocating the portfolio assets so as to maximize returns while minimizing risk taken. It is typically carried out by financial professionals who use a combination of quantitative techniques and investment expertise to make decisions about the portfolio allocation. Recent applications of Deep Reinforcement Learning (DRL) have shown promising results when used to optimize portfolio allocation by training model-free agents on historical market data. Many of these methods compare their results against basic benchmarks or other state-of-the-art DRL agents but often fail to compare their performance against traditional methods used by financial professionals in practical settings. One of the most commonly used methods for this task is Mean-Variance Portfolio O...