[2602.15727] Spanning the Visual Analogy Space with a Weight Basis of LoRAs
Summary
The paper presents LoRWeB, a novel approach to visual analogy learning that performs image manipulation by dynamically selecting and weighting Low-Rank Adaptation (LoRA) modules, improving generalization to unseen transformations.
Why It Matters
This research addresses limitations in current visual analogy learning methods by introducing a flexible framework that allows for better generalization across diverse visual transformations. The findings could significantly impact fields such as computer vision and AI-driven image processing, enhancing user interaction with visual content.
Key Takeaways
- LoRWeB enables dynamic composition of learned transformation primitives for visual analogy tasks.
- The method improves generalization to unseen visual transformations compared to existing techniques.
- A learnable basis of LoRA modules is introduced to span various visual transformations effectively.
- The approach demonstrates state-of-the-art performance in comprehensive evaluations.
- Findings suggest that LoRA basis decompositions are a promising direction for flexible visual manipulation.
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.15727 (cs) [Submitted on 17 Feb 2026]
Title: Spanning the Visual Analogy Space with a Weight Basis of LoRAs
Authors: Hila Manor, Rinon Gal, Haggai Maron, Tomer Michaeli, Gal Chechik
Abstract: Visual analogy learning enables image manipulation through demonstration rather than textual description, allowing users to specify complex transformations difficult to articulate in words. Given a triplet $\{\mathbf{a}, \mathbf{a}', \mathbf{b}\}$, the goal is to generate $\mathbf{b}'$ such that $\mathbf{a} : \mathbf{a}' :: \mathbf{b} : \mathbf{b}'$. Recent methods adapt text-to-image models to this task using a single Low-Rank Adaptation (LoRA) module, but they face a fundamental limitation: attempting to capture the diverse space of visual transformations within a fixed adaptation module constrains generalization capabilities. Inspired by recent work showing that LoRAs in constrained domains span meaningful, interpolatable semantic spaces, we propose LoRWeB, a novel approach that specializes the model for each analogy task at inference time through dynamic composition of learned transformation primitives, informally, choosing a point in a "space of LoRAs". We introduce two key components: (1) a learnable basis of LoRA modules, to span the space of different ...
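The core idea of a learnable LoRA basis combined per task can be illustrated with a minimal sketch. Everything below (the `LoRABasis` class, the mixing weights `alpha`, the layer sizes) is a hypothetical illustration of the general mechanism, not the paper's actual implementation:

```python
import torch

class LoRABasis(torch.nn.Module):
    """A bank of K low-rank adapters whose weighted sum updates a frozen layer.

    Hypothetical sketch: each basis element k is a low-rank update
    W_k = B_k @ A_k, and a per-task weight vector alpha mixes them.
    """

    def __init__(self, d_in, d_out, rank=4, n_basis=8):
        super().__init__()
        # Standard LoRA init: A small random, B zero, so the update starts at 0.
        self.A = torch.nn.Parameter(torch.randn(n_basis, rank, d_in) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(n_basis, d_out, rank))

    def forward(self, x, alpha):
        # alpha: (n_basis,) mixing weights, e.g. chosen per analogy task.
        # Combined update: delta = sum_k alpha_k * B_k @ A_k
        delta = torch.einsum("k,kor,kri->oi", alpha, self.B, self.A)
        return x @ delta.T

frozen = torch.nn.Linear(16, 16)   # stands in for a frozen diffusion-model layer
basis = LoRABasis(16, 16, rank=4, n_basis=8)

x = torch.randn(2, 16)
alpha = torch.softmax(torch.randn(8), dim=0)  # in LoRWeB these would be task-dependent
y = frozen(x) + basis(x, alpha)
print(y.shape)  # torch.Size([2, 16])
```

Choosing `alpha` at inference time is what "picks a point in the space of LoRAs": different weightings specialize the same frozen backbone to different transformations without retraining the adapters.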