[2508.04663] HierarchicalPrune: Position-Aware Compression for Large-Scale Diffusion Models
Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.04663 (cs)

[Submitted on 6 Aug 2025 (v1), last revised 2 Mar 2026 (this version, v4)]

Title: HierarchicalPrune: Position-Aware Compression for Large-Scale Diffusion Models

Authors: Young D. Kwon, Rui Li, Sijia Li, Da Li, Sourav Bhattacharya, Stylianos I. Venieris

Abstract: State-of-the-art text-to-image diffusion models (DMs) achieve remarkable quality, yet their massive parameter scale (8-11B) poses significant challenges for inference on resource-constrained devices. In this paper, we present HierarchicalPrune, a novel compression framework grounded in a key observation: DM blocks exhibit distinct functional hierarchies, where early blocks establish semantic structures while later blocks handle texture refinements. HierarchicalPrune synergistically combines three techniques: (1) Hierarchical Position Pruning, which identifies and removes less essential later blocks based on position hierarchy; (2) Positional Weight Preservation, which systematically protects early model portions that are essential for semantic structural integrity; and (3) Sensitivity-Guided Distillation, which adjusts knowledge-transfer intensity based on our discovery of block-wise sensitivity variations. As a result, our framework brings billion-scale diffusion...
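The position-based pruning idea in the abstract (keep early semantic blocks, remove later texture-refinement blocks) can be illustrated with a minimal sketch. The function name, the `protect_frac`/`drop_frac` hyperparameters, and the drop-from-the-end policy are illustrative assumptions here, not the paper's actual criterion:

```python
# Hypothetical sketch of position-aware block pruning: protect the early
# blocks (semantic structure) and prune a fraction of the later blocks
# (texture refinement). All names and defaults below are assumptions for
# illustration only.

def position_aware_prune(num_blocks, protect_frac=0.5, drop_frac=0.5):
    """Return indices of backbone blocks to keep after pruning.

    num_blocks:   total transformer blocks (position = index).
    protect_frac: leading fraction that is never pruned (assumed).
    drop_frac:    fraction of the later blocks to remove, from the end.
    """
    n_protected = int(num_blocks * protect_frac)
    protected = list(range(n_protected))          # early blocks, always kept
    later = list(range(n_protected, num_blocks))  # pruning candidates
    n_drop = int(len(later) * drop_frac)
    kept_later = later[:len(later) - n_drop]      # drop the trailing blocks
    return protected + kept_later

print(position_aware_prune(12))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

For a 12-block backbone this keeps all 6 protected early blocks and the first 3 of the 6 later blocks; the real framework would replace the uniform drop with its position-hierarchy importance measure and follow pruning with sensitivity-guided distillation.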