[2407.17412] (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork
Summary
The paper presents PASS, an algorithmic framework that uses visual prompts to estimate channel importance and derive structural sparsity in neural networks. It is built around a tailored recurrent hyper-network that takes visual prompts as one of its inputs.
Why It Matters
Large-scale neural networks demand massive computation resources, so pruning techniques that lower that cost are essential. Structural pruning is attractive because its sparsity patterns are acceleration-friendly, but it hinges on one hard question: how to estimate the significance of each channel. This research explores whether visual prompts can inform that estimate.
Key Takeaways
- PASS leverages visual prompts to determine channel importance in neural networks.
- The framework achieves better accuracy and speedup compared to baseline models.
- Comprehensive experiments validate the effectiveness of PASS across multiple architectures and datasets.
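To make "channel importance" and "structural sparsity" concrete, here is a minimal sketch of channel-level pruning. It is not the PASS method: PASS derives its importance scores from a hyper-network conditioned on visual prompts, whereas this sketch swaps in the classic L1-norm magnitude heuristic as a stand-in scorer. The function names (`l1_channel_scores`, `channel_mask`, `prune_channels`) and the toy weights are hypothetical and chosen only for illustration.

```python
def l1_channel_scores(filters):
    """Score each output channel by the L1 norm of its flattened weights.

    Assumption: a stand-in heuristic, NOT the learned scores PASS produces.
    """
    return [sum(abs(w) for w in channel) for channel in filters]


def channel_mask(scores, sparsity):
    """Return a 0/1 mask that keeps the highest-scoring channels.

    `sparsity` is the fraction of channels to remove (0.5 removes half).
    At least one channel is always kept.
    """
    n_keep = max(1, round(len(scores) * (1.0 - sparsity)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


def prune_channels(filters, mask):
    """Drop whole channels whose mask entry is 0 (structural sparsity)."""
    return [channel for channel, m in zip(filters, mask) if m]


# Toy layer: 4 output channels, each flattened to 3 weights (made-up values).
filters = [
    [0.9, -0.8, 0.7],
    [0.01, 0.02, -0.01],
    [0.5, 0.5, -0.5],
    [0.0, 0.1, 0.0],
]

scores = l1_channel_scores(filters)
mask = channel_mask(scores, sparsity=0.5)
pruned = prune_channels(filters, mask)

print(mask)         # [1, 0, 1, 0]
print(len(pruned))  # 2
```

Because entire channels are removed rather than individual weights being zeroed, the pruned layer is simply a smaller dense layer, which is why this kind of sparsity is described as acceleration-friendly. In a PASS-style setup, only the scoring step would change; the masking and removal steps would look broadly similar.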
Computer Science > Computer Vision and Pattern Recognition
arXiv:2407.17412 (cs)
[Submitted on 24 Jul 2024 (v1), last revised 21 Feb 2026 (this version, v2)]
Title: (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork
Authors: Tianjin Huang, Fang Meng, Li Shen, Fan Liu, Yulong Pei, Mykola Pechenizkiy, Shiwei Liu, Tianlong Chen
Abstract: Large-scale neural networks have demonstrated remarkable performance in different domains like vision and language processing, although at the cost of massive computation resources. As illustrated by compression literature, structural model pruning is a prominent algorithm to encourage model efficiency, thanks to its acceleration-friendly sparsity patterns. One of the key questions of structural pruning is how to estimate the channel significance. In parallel, work on data-centric AI has shown that prompting-based techniques enable impressive generalization of large language models across diverse downstream tasks. In this paper, we investigate a charming possibility: leveraging visual prompts to capture the channel importance and derive high-quality structural sparsity. To this end, we propose a novel algorithmic framework, namely PASS. It is a tailored hyper-network to take both visual prompts and network weigh...