[2602.21569] How many asymmetric communities are there in multi-layer directed networks?
Summary
This paper addresses the problem of estimating the numbers of sender and receiver communities, which may differ, in multi-layer directed networks. It introduces a novel goodness-of-fit test and a sequential testing procedure that selects both numbers.
Why It Matters
Community detection methods typically need the number of communities as input. In multi-layer directed networks the sending and receiving sides can have different numbers of communities, so both must be estimated. This research provides a statistical test and a stopping rule for choosing that pair of numbers, which is relevant to network analysis, social sciences, and data science.
Key Takeaways
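- Introduces a novel goodness-of-fit test for a candidate pair of sender and receiver community numbers.
- Proposes a sequential testing procedure that searches candidate pairs in lexicographic order and stops at the smallest pair whose test statistic falls below a decaying threshold.
- Works under the multi-layer stochastic co-block model, where the statistic's upper bound converges to zero under correct specification and the statistic diverges under underfitting.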
Mathematics > Statistics Theory
arXiv:2602.21569 (math)
[Submitted on 25 Feb 2026]
Title: How many asymmetric communities are there in multi-layer directed networks?
Authors: Huan Qing
Abstract: Estimating the asymmetric numbers of communities in multi-layer directed networks is a challenging problem due to the multi-layer structures and inherent directional asymmetry, leading to possibly different numbers of sender and receiver communities. This work addresses this issue under the multi-layer stochastic co-block model, a model for multi-layer directed networks with distinct community structures in sending and receiving sides, by proposing a novel goodness-of-fit test. The test statistic relies on the deviation of the largest singular value of an aggregated normalized residual matrix from the constant 2. The test statistic exhibits a sharp dichotomy: under the null hypothesis of correct model specification, its upper bound converges to zero with high probability; under underfitting, the test statistic itself diverges to infinity. With this property, we develop a sequential testing procedure that searches through candidate pairs of sender and receiver community numbers in a lexicographic order. The process stops at the smallest such pair where the test statistic drops below a decaying threshold. For robustness, we also propose a ...
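The sequential search described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the function name `sequential_search`, the parameters `stat_fn`, `threshold`, and `k_max`, and the toy statistic are all hypothetical. The abstract does not specify how the residual matrix is normalized or aggregated, nor the form of the decaying threshold, so the statistic is passed in as a callable and the threshold as a plain number.

```python
from itertools import product
from typing import Callable, Optional, Tuple


def sequential_search(
    stat_fn: Callable[[int, int], float],
    threshold: float,
    k_max: int,
) -> Optional[Tuple[int, int]]:
    """Return the first (sender, receiver) pair, in lexicographic order,
    whose statistic is below the threshold, or None if no pair qualifies.

    stat_fn, threshold, and k_max are illustrative assumptions; the paper's
    actual statistic and threshold are not reproduced here.
    """
    # product(...) yields (1, 1), (1, 2), ..., (1, k_max), (2, 1), ...
    # which is lexicographic order over the candidate pairs.
    for k_s, k_r in product(range(1, k_max + 1), repeat=2):
        if stat_fn(k_s, k_r) < threshold:
            return (k_s, k_r)
    return None


def toy_stat(k_s: int, k_r: int) -> float:
    """Toy stand-in for the test statistic: large when either candidate
    number is below the (made-up) truth (2, 3), small otherwise."""
    underfit = k_s < 2 or k_r < 3
    return 10.0 if underfit else 0.01


estimate = sequential_search(toy_stat, threshold=0.5, k_max=5)
print(estimate)
```

The toy statistic mimics the dichotomy stated in the abstract: it stays large for underfitted pairs and becomes small once both candidate numbers reach the true values, so the search stops at the smallest adequate pair.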