[2510.15165] Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation Approach
Computer Science > Machine Learning
arXiv:2510.15165 (cs)
[Submitted on 16 Oct 2025 (v1), last revised 3 Mar 2026 (this version, v3)]
Title: Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation Approach
Authors: Xin Guo, Zijiu Lyu
Abstract: This paper studies policy transfer, one of the well-known transfer learning techniques adopted in large language models, for continuous-time reinforcement learning problems. In the case of continuous-time linear-quadratic systems with Shannon's entropy regularization, we fully exploit the Gaussian structure of their optimal policy and the stability of their associated Riccati equations. In the general case where the system has possibly non-linear and bounded dynamics, the key technical component is the stability of diffusion SDEs, which is established by invoking rough path theory. Our work provides the first theoretical proof of policy transfer for continuous-time RL: an optimal policy learned for one RL problem can be used to initialize the search for a near-optimal policy for another closely related RL problem, while achieving (at least) the same rate of convergence as the original algorithm. As a byproduct of our analysis, we derive the stability of a concrete class of continuous-time score-based diffusio...