[2502.04591] Are We Measuring Oversmoothing in Graph Neural Networks Correctly?
Summary
This article critiques traditional metrics for measuring oversmoothing in Graph Neural Networks (GNNs) and proposes a rank-based approach as a more effective alternative.
Why It Matters
Understanding oversmoothing is crucial for improving GNN performance. This research highlights the limitations of existing metrics and offers a new method that aligns better with real-world scenarios, potentially guiding future GNN architecture design and evaluation.
Key Takeaways
- Traditional metrics for oversmoothing in GNNs are limited and often misleading.
- The proposed rank-based metrics provide a more reliable assessment of oversmoothing.
- Performance degradation in GNNs can occur with fewer layers than previously thought.
- Numerical rank closely correlates with model performance, offering a new evaluation perspective.
- Theoretical insights support the effectiveness of rank-based approaches over energy-based metrics.
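To make the contrast between the metric families concrete, here is a minimal sketch of the three quantities discussed: the Dirichlet energy (an energy-based metric over edges) and the numerical and effective rank of a feature matrix (the rank-based alternatives). The exact normalizations used in the paper may differ; these are the standard textbook definitions (numerical rank as the squared Frobenius-to-spectral norm ratio, effective rank as the exponential of the entropy of the normalized singular values).

```python
import numpy as np

def dirichlet_energy(X, edges):
    # Energy-based metric: sum of squared feature differences across graph edges
    # (unnormalized graph Laplacian form). Low energy = similar neighbours.
    return float(sum(np.sum((X[i] - X[j]) ** 2) for i, j in edges))

def numerical_rank(X):
    # ||X||_F^2 / ||X||_2^2: how many singular values are "comparable" to the largest.
    s = np.linalg.svd(X, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

def effective_rank(X, eps=1e-12):
    # exp of the Shannon entropy of the normalized singular-value distribution.
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]
    return float(np.exp(-(p * np.log(p)).sum()))

# Example: identity features have full rank; constant features collapse to rank 1.
X_full = np.eye(4)          # 4 nodes, 4 orthogonal feature vectors
X_flat = np.ones((4, 2))    # 4 nodes with identical features (fully oversmoothed)
print(numerical_rank(X_full), numerical_rank(X_flat))
```

Both rank measures equal the matrix rank for matrices with equal nonzero singular values and degrade smoothly in between, which is what makes them usable as continuous oversmoothing diagnostics.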
Computer Science > Machine Learning
arXiv:2502.04591 (cs)
[Submitted on 7 Feb 2025 (v1), last revised 21 Feb 2026 (this version, v4)]
Title: Are We Measuring Oversmoothing in Graph Neural Networks Correctly?
Authors: Kaicheng Zhang, Piero Deidda, Desmond Higham, Francesco Tudisco
Abstract: Oversmoothing is a fundamental challenge in graph neural networks (GNNs): as the number of layers increases, node embeddings become increasingly similar, and model performance drops sharply. Traditionally, oversmoothing has been quantified using metrics that measure the similarity of neighbouring node features, such as the Dirichlet energy. We argue that these metrics have critical limitations and fail to reliably capture oversmoothing in realistic scenarios. For instance, they provide meaningful insights only for very deep networks, while typical GNNs show a performance drop already with as few as 10 layers. As an alternative, we propose measuring oversmoothing by examining the numerical or effective rank of the feature representations. We provide extensive numerical evaluation across diverse graph architectures and datasets to show that rank-based metrics consistently capture oversmoothing, whereas energy-based metrics often fail. Notably, we reveal that drops in the rank align closely with performance degradation, even in sc...
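The rank-collapse phenomenon the abstract describes can be simulated directly: repeatedly applying a mean-aggregation step (a stand-in for stacking message-passing layers, not the paper's actual experimental setup) drives the numerical rank of the node features toward 1. This sketch uses a deterministic cycle graph with self-loops so the behaviour is reproducible.

```python
import numpy as np

def numerical_rank(X):
    # ||X||_F^2 / ||X||_2^2, a smooth proxy for matrix rank.
    s = np.linalg.svd(X, compute_uv=False)
    return float((s ** 2).sum() / s[0] ** 2)

n, d = 20, 8
# Cycle graph with self-loops: node i is linked to i-1, i, i+1 (mod n).
A = np.eye(n)
for i in range(n):
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0
P = A / A.sum(axis=1, keepdims=True)  # row-stochastic mean-aggregation operator

rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))
rank_initial = numerical_rank(X)
for _ in range(200):            # crude stand-in for 200 aggregation "layers"
    X = P @ X
rank_final = numerical_rank(X)
print(rank_initial, rank_final)  # rank collapses toward 1 as features homogenize
```

Because the aggregation operator is row-stochastic on a connected, aperiodic graph, repeated application converges to a rank-one matrix (every node gets the same feature vector), which is exactly the oversmoothed regime the rank-based metrics are designed to detect.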