[2509.23465] ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Problems
Computer Science > Artificial Intelligence
arXiv:2509.23465 (cs)
[Submitted on 27 Sep 2025 (v1), last revised 1 Mar 2026 (this version, v2)]

Title: ViTSP: A Vision Language Models Guided Framework for Solving Large-Scale Traveling Salesman Problems
Authors: Zhuoli Yin, Yi Ding, Reem Khir, Hua Cai

Abstract: Solving the Traveling Salesman Problem (TSP) is NP-hard yet fundamental for a wide range of real-world applications. Classical exact methods struggle to scale, and heuristic methods often require domain-specific parameter calibration. While learning-based approaches have shown promise, they suffer from poor generalization and limited scalability due to fixed training data. This work proposes ViTSP, a novel framework that leverages pre-trained vision language models (VLMs) to visually guide the solution process for large-scale TSPs. The VLMs identify promising small-scale subproblems from a visualized TSP instance, which are then efficiently optimized using an off-the-shelf solver to improve the global solution. ViTSP bypasses dedicated model training at the user end while maintaining effectiveness across diverse instances. Experiments on real-world TSP instances ranging from 1k to 88k nodes demonstrate that ViTSP consistently achieves solutions with average optimality gaps of ...
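The subproblem idea described in the abstract (pick a small part of the tour, re-optimize it, and splice it back into the global solution) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The window `(start, size)` is a hypothetical stand-in for the VLM's selection, brute-force enumeration stands in for the off-the-shelf solver, and the function names are invented for this example.

```python
import itertools
import math


def tour_length(coords, tour):
    """Length of the closed tour that visits `coords` in the order given by `tour`."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def reoptimize_segment(coords, tour, start, size):
    """Re-order tour[start:start+size] while keeping both neighbouring nodes fixed.

    Assumption: brute force replaces a real solver here, so this is only
    viable for very small segments.
    """
    assert 0 < start and start + size < len(tour)
    prev_node, next_node = tour[start - 1], tour[start + size]
    segment = tour[start:start + size]

    def path_cost(order):
        # Cost of the open path prev_node -> order... -> next_node.
        nodes = (prev_node, *order, next_node)
        return sum(math.dist(coords[a], coords[b]) for a, b in zip(nodes, nodes[1:]))

    # The current order is one of the candidates, so the tour never gets longer.
    best = min(itertools.permutations(segment), key=path_cost)
    return tour[:start] + list(best) + tour[start + size:]
```

Because the rest of the tour is untouched and the segment's endpoints stay connected to the same neighbours, any improvement inside the window carries over directly to the global tour length. A real pipeline would presumably choose many such windows and use a far stronger solver on each one.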