[2602.24210] Controllable Reasoning Models Are Private Thinkers
arXiv:2602.24210 [cs] — Computer Science > Computation and Language
Submitted on 27 Feb 2026

Title: Controllable Reasoning Models Are Private Thinkers
Authors: Haritz Puerto, Haonan Li, Xudong Han, Timothy Baldwin, Iryna Gurevych

Abstract: AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result in the unintended leakage of private information to external parties. We propose training models to follow instructions not only in the final answer but also in the reasoning trace, potentially under different constraints. We hypothesize that improving a model's instruction-following abilities in the reasoning trace can improve its privacy-preservation skills. To demonstrate this, we fine-tune models on a new instruction-following dataset with explicit restrictions on reasoning traces. We further introduce a generation strategy that decouples reasoning and answer generation using separate LoRA adapters. We evaluate our approach on six models from two model families, ranging from 1.7B to 14B parameters, across two instruction-following benchmarks and two privacy benchmarks. Our method yields substantial improvements, achieving gains of up to 20.9 points in instruction-following performance and up to 51.9 percentage points on privacy benchmarks. Thes...