[2602.15669] PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra
Summary
The paper introduces PERSONA, a novel framework for dynamic personality control in Large Language Models (LLMs) using activation vector algebra, achieving results comparable to fine-tuning without requiring gradient updates.
Why It Matters
This research addresses the limitations of current personality control methods in LLMs, which rely on static prompting or expensive fine-tuning. By demonstrating that personality traits can be manipulated algebraically in the model's representation space, it opens new avenues for more interpretable and efficient control of AI behavior, which could in turn improve how users interact with AI systems.
Key Takeaways
- The PERSONA framework enables dynamic personality control at inference time, without fine-tuning.
- Personality traits appear as extractable, approximately orthogonal directions in activation space.
- On benchmarks such as PersonalityBench, the approach nears supervised fine-tuning results.
- Vector arithmetic gives precise control over traits: scalar multiplication sets intensity, addition composes traits, and subtraction suppresses them.
- The findings suggest new methodologies for interpretable AI behavior.
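The vector operations named in the takeaways can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's implementation: the function names (`extract_trait_vector`, `compose`, `apply_steering`) are hypothetical, the "activations" are synthetic random data rather than real model hidden states, and difference of means is assumed here as one common form of contrastive activation analysis.

```python
import math
import random
from typing import Dict, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_vector(rows: List[Vector]) -> Vector:
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def extract_trait_vector(pos_acts: List[Vector], neg_acts: List[Vector]) -> Vector:
    """Assumed contrastive recipe: mean activation on trait-expressing inputs
    minus mean activation on trait-opposing inputs, scaled to unit length."""
    diff = [p - q for p, q in zip(mean_vector(pos_acts), mean_vector(neg_acts))]
    norm = math.sqrt(dot(diff, diff))
    return [x / norm for x in diff]


def compose(trait_vectors: Dict[str, Vector], weights: Dict[str, float]) -> Vector:
    """Weighted sum of trait directions. The size of a weight sets intensity;
    a negative weight subtracts (suppresses) that trait."""
    dim = len(next(iter(trait_vectors.values())))
    steer = [0.0] * dim
    for name, weight in weights.items():
        for i, value in enumerate(trait_vectors[name]):
            steer[i] += weight * value
    return steer


def apply_steering(hidden: List[Vector], steer: Vector) -> List[Vector]:
    """Add the steering vector to the hidden state at every position."""
    return [[h + s for h, s in zip(row, steer)] for row in hidden]


# Synthetic stand-in for layer activations (hypothetical data, not model output).
rng = random.Random(0)
DIM, N = 16, 200


def fake_acts(axis: int, sign: float) -> List[Vector]:
    rows = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(N)]
    for row in rows:
        row[axis] += sign * 3.0  # plant the trait along one coordinate
    return rows


traits = {
    "extraversion": extract_trait_vector(fake_acts(0, +1.0), fake_acts(0, -1.0)),
    "agreeableness": extract_trait_vector(fake_acts(1, +1.0), fake_acts(1, -1.0)),
}

# Near-zero cosine means the two directions are approximately orthogonal.
cosine = dot(traits["extraversion"], traits["agreeableness"])

# Amplify extraversion (weight 1.5) while suppressing agreeableness (weight -1.0).
steer = compose(traits, {"extraversion": 1.5, "agreeableness": -1.0})
hidden = [[0.0] * DIM for _ in range(5)]  # placeholder hidden states
steered = apply_steering(hidden, steer)

print("cosine between traits:", round(cosine, 3))
print("projection on extraversion:", round(dot(steered[0], traits["extraversion"]), 3))
print("projection on agreeableness:", round(dot(steered[0], traits["agreeableness"]), 3))
```

In a real setting the steering vector would be added to a transformer layer's hidden states during generation. Which layer to use and how to scale the weights are choices this sketch leaves open.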
Computer Science > Artificial Intelligence
arXiv:2602.15669 (cs)
[Submitted on 17 Feb 2026]
Authors: Xiachong Feng, Liang Zhao, Weihong Zhong, Yichong Huang, Yuxuan Gu, Lingpeng Kong, Xiaocheng Feng, Bing Qin
Abstract
Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves fine-tuning level performance through direct manipulation of personality vectors in activation space. Our key insight is that personality traits appear as extractable, approximately orthogonal directions in the model's representation space that support algebraic operations. The framework operates through three stages: Persona-Base extracts orthogonal trait vectors via contrastive activation analysis; Persona-Algebra enables precise control through vector arithmetic (scalar multiplication for intensity, addition for composition, subtraction for suppression); and Persona-Flow achieves context-aware adaptation by dynamically composing these vectors during inference. On PersonalityBench, our approach achieves a mean sco...