[2603.21155] Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
Computer Science > Artificial Intelligence
arXiv:2603.21155 (cs)
[Submitted on 22 Mar 2026]

Title: Can LLMs Fool Graph Learning? Exploring Universal Adversarial Attacks on Text-Attributed Graphs
Authors: Zihui Chen, Yuling Wang, Pengfei Jiao, Kai Wu, Xiao Wang, Xiang Ao, Dalin Zhang

Abstract: Text-attributed graphs (TAGs) enhance graph learning by integrating rich textual semantics and topological context for each node. While boosting expressiveness, they also expose new vulnerabilities in graph learning through text-based adversarial surfaces. Recent advances leverage diverse backbones, such as graph neural networks (GNNs) and pre-trained language models (PLMs), to capture both structural and textual information in TAGs. This diversity raises a key question: how can we design universal adversarial attacks that generalize across architectures to assess the security of TAG models? The challenge arises from the stark contrast in how different backbones, GNNs and PLMs, perceive and encode graph patterns, coupled with the fact that many PLMs are only accessible via APIs, limiting attacks to black-box settings. To address this, we propose BadGraph, a novel attack framework that deeply elicits large language models' (LLMs) understanding of general graph knowledge to jointly perturb both node topology...
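
For readers unfamiliar with the setting, the sketch below illustrates the data structure the abstract assumes: a TAG pairs per-node text with graph topology, and a GNN-style layer mixes the two. This is a minimal toy illustration, not the paper's BadGraph attack; the bag-of-words encoder is a stand-in for a real PLM, and all names and texts here are invented for the example.

```python
# Minimal text-attributed graph (TAG) sketch: per-node text + edge list,
# with one GCN-style propagation step. Toy example only; not BadGraph.
import torch

# --- Toy TAG: node texts plus an undirected edge list ------------------
node_texts = [
    "graph neural networks for node classification",
    "pre-trained language models encode text",
    "adversarial attacks on graph learning",
]
edges = [(0, 1), (1, 2)]  # topological context between nodes

# --- Text side: bag-of-words features (stand-in for a PLM encoder) -----
vocab = sorted({w for t in node_texts for w in t.split()})
index = {w: i for i, w in enumerate(vocab)}
X = torch.zeros(len(node_texts), len(vocab))
for n, text in enumerate(node_texts):
    for w in text.split():
        X[n, index[w]] += 1.0

# --- Structure side: symmetric-normalized adjacency with self-loops ----
A = torch.eye(len(node_texts))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
deg_inv_sqrt = A.sum(dim=1).rsqrt()
A_hat = deg_inv_sqrt[:, None] * A * deg_inv_sqrt[None, :]

# --- One GCN-style layer: each node aggregates neighbors' text features
W = torch.nn.Linear(len(vocab), 8)
H = torch.relu(A_hat @ W(X))  # (num_nodes, 8) node embeddings
print(H.shape)
```

Because the embeddings H depend on both the node texts (through X) and the topology (through A_hat), an attacker can in principle perturb either surface, which is the joint text-and-structure attack surface the abstract describes.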