[2410.16882] SaVe-TAG: LLM-based Interpolation for Long-Tailed Text-Attributed Graphs

arXiv - Machine Learning · 4 min read

Summary

The paper presents SaVe-TAG, a framework that uses Large Language Models for semantic-aware interpolation in long-tailed text-attributed graphs, improving classification performance on imbalanced datasets.

Why It Matters

This research addresses the challenges of class imbalance in graph neural networks, particularly in text-attributed graphs. By integrating semantic understanding through LLMs, it offers a more effective method for generating synthetic samples, which is crucial for improving model generalization and performance in real-world applications.

Key Takeaways

  • SaVe-TAG leverages LLMs for semantic-aware interpolation in graph data.
  • The method addresses class imbalance in long-tailed distributions effectively.
  • A confidence-based edge assignment mechanism ensures structural consistency.
  • Extensive experiments demonstrate superior performance over existing methods.
  • The approach highlights the importance of combining semantic and structural signals.
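The core idea behind the first two takeaways, interpolating between two minority-class texts at the text level rather than in embedding space, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt template and the `call_llm` placeholder are assumptions for demonstration, and any real backend would replace the stub.

```python
# Sketch of LLM-based text interpolation for a minority class.
# `call_llm` is a placeholder for any text-generation backend;
# the prompt wording is illustrative, not the paper's template.

def build_interpolation_prompt(text_a: str, text_b: str, label: str) -> str:
    """Ask an LLM to write a new sample 'between' two minority-class texts."""
    return (
        f"Both passages below belong to the class '{label}'.\n"
        f"Passage 1: {text_a}\n"
        f"Passage 2: {text_b}\n"
        "Write a new passage of the same class that blends the "
        "content of both passages."
    )

def call_llm(prompt: str) -> str:
    # Placeholder: in practice, send `prompt` to an LLM API and
    # return its completion. Here we just echo a marker string.
    return f"[synthetic sample for prompt of {len(prompt)} chars]"

def interpolate_minority_pair(text_a: str, text_b: str, label: str) -> str:
    """Generate one synthetic minority-class text from a pair of real ones."""
    prompt = build_interpolation_prompt(text_a, text_b, label)
    return call_llm(prompt)
```

Because interpolation happens in text space, the synthetic sample stays on the data manifold (it is a plausible document), which is the advantage the paper claims over embedding-space arithmetic.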

Computer Science > Artificial Intelligence · arXiv:2410.16882 (cs)
[Submitted on 22 Oct 2024 (v1), last revised 13 Feb 2026 (this version, v5)]

Title: SaVe-TAG: LLM-based Interpolation for Long-Tailed Text-Attributed Graphs
Authors: Leyao Wang, Yu Wang, Bo Ni, Yuying Zhao, Hanyu Wang, Yao Ma, Tyler Derr

Abstract: Real-world graph data often follows long-tailed distributions, making it difficult for Graph Neural Networks (GNNs) to generalize well across both head and tail classes. Recent advances in Vicinal Risk Minimization (VRM) have shown promise in mitigating class imbalance with numeric interpolation; however, existing approaches largely rely on embedding-space arithmetic, which fails to capture the rich semantics inherent in text-attributed graphs. In this work, we propose our method, SaVe-TAG (Semantic-aware Vicinal Risk Minimization for Long-Tailed Text-Attributed Graphs), a novel VRM framework that leverages Large Language Models (LLMs) to perform text-level interpolation, generating on-manifold, boundary-enriching synthetic samples for minority classes. To mitigate the risk of noisy generation, we introduce a confidence-based edge assignment mechanism that uses graph topology as a natural filter to ensure structural consistency. We provide theoretical justification for our method and conduct extensive ex...
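The confidence-based edge assignment described in the abstract can be illustrated with a small sketch. This is an assumption-laden stand-in, not the paper's mechanism: here cosine similarity in an embedding space plays the role of the confidence score, and `threshold` is an arbitrary illustrative value.

```python
import numpy as np

def assign_edges_by_confidence(synthetic_emb: np.ndarray,
                               node_embs: np.ndarray,
                               threshold: float = 0.8) -> np.ndarray:
    """Connect a synthetic node only to existing nodes whose similarity
    score exceeds `threshold`, using topology as a filter against noisy
    generations. Returns the indices of accepted neighbor nodes."""
    # Cosine similarity between the synthetic node and every existing node.
    a = synthetic_emb / np.linalg.norm(synthetic_emb)
    B = node_embs / np.linalg.norm(node_embs, axis=1, keepdims=True)
    sims = B @ a
    return np.flatnonzero(sims >= threshold)
```

A synthetic node that resembles no existing minority-class node gets no high-confidence edges, so a noisy LLM generation ends up isolated rather than corrupting the graph structure.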

Related Articles

[2603.29171] Segmentation of Gray Matters and White Matters from Brain MRI data
arXiv - Machine Learning · 4 min

[2602.09924] LLMs Encode Their Failures: Predicting Success from Pre-Generation Activations
arXiv - Machine Learning · 3 min

[2602.01528] Making Bias Non-Predictive: Training Robust LLM Reasoning via Reinforcement Learning
arXiv - Machine Learning · 4 min

[2601.22783] Compact Hypercube Embeddings for Fast Text-based Wildlife Observation Retrieval
arXiv - Machine Learning · 4 min