[2603.02237] Concept Heterogeneity-aware Representation Steering
Computer Science > Machine Learning

arXiv:2603.02237 (cs)

[Submitted on 13 Feb 2026]

Title: Concept Heterogeneity-aware Representation Steering

Authors: Laziz U. Abdullaev, Noelle Y. L. Wong, Ryan T. Z. Lee, Shiqi Jiang, Khoi N. M. Nguyen, Tan M. Nguyen

Abstract: Representation steering offers a lightweight mechanism for controlling the behavior of large language models (LLMs) by intervening on internal activations at inference time. Most existing methods rely on a single global steering direction, typically obtained via difference-in-means over contrastive datasets. This approach implicitly assumes that the target concept is homogeneously represented across the embedding space. In practice, however, LLM representations can be highly non-homogeneous, exhibiting clustered, context-dependent structure, which renders global steering directions brittle. In this work, we view representation steering through the lens of optimal transport (OT), noting that standard difference-in-means steering implicitly corresponds to the OT map between two unimodal Gaussian distributions with identical covariance, yielding a global translation. To relax this restrictive assumption, we theoretically model source and target representations as Gaussian mixture models and formulate steering as a discrete OT problem between semantic latent clusters. ...
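The abstract contrasts global difference-in-means steering (a single translation applied to every activation) with cluster-wise transport between semantic clusters. Below is a minimal NumPy/SciPy sketch of both views, not the paper's implementation: it assumes activations are given as arrays, cluster labels come from some external clustering (e.g. k-means), and cluster weights are uniform with equal cluster counts, so the discrete OT plan reduces to a one-to-one assignment solvable by the Hungarian algorithm. All function names are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def diff_in_means_direction(src, tgt):
    """Global steering direction: the OT map between two Gaussians with
    identical covariance is a translation by the difference of means."""
    return tgt.mean(axis=0) - src.mean(axis=0)

def clusterwise_ot_directions(src, src_labels, tgt, tgt_labels, k):
    """Pair source and target cluster means via discrete OT.
    With uniform weights and equal cluster counts, the optimal transport
    plan is a permutation, found here with the Hungarian algorithm."""
    src_means = np.stack([src[src_labels == i].mean(axis=0) for i in range(k)])
    tgt_means = np.stack([tgt[tgt_labels == j].mean(axis=0) for j in range(k)])
    # squared-Euclidean cost between every pair of cluster means
    cost = np.linalg.norm(src_means[:, None, :] - tgt_means[None, :, :], axis=-1) ** 2
    rows, cols = linear_sum_assignment(cost)
    # per-cluster steering direction: move each source cluster mean
    # onto its matched target cluster mean
    return {int(i): tgt_means[j] - src_means[i] for i, j in zip(rows, cols)}
```

At inference time, a hidden state would then be assigned to its nearest source cluster and shifted by that cluster's direction, rather than by one global vector as in difference-in-means steering.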