[2602.17092] A Locality Radius Framework for Understanding Relational Inductive Bias in Database Learning
Summary
This paper introduces locality radius, a formal measure of the minimum structural neighborhood required to determine a prediction in a relational schema, and uses it to study when relational inductive bias helps in database learning.
Why It Matters
Schema-level prediction tasks such as foreign key discovery are often modeled with graph neural networks (GNNs) on the implicit assumption that relational inductive bias improves performance. This work addresses the open question of when multi-hop structural reasoning is actually necessary, which could inform the choice of aggregation depth for models that operate on relational data.
Key Takeaways
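- Introduces locality radius as a formal measure of the minimum structural neighborhood required to determine a prediction in relational schemas.
- Hypothesizes that model performance depends critically on alignment between a task's locality radius and the architecture's aggregation depth.
- Tests the hypothesis in a controlled empirical study covering foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and synthetic radius-controlled benchmarks.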
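To make the idea of a locality radius concrete, the following is a minimal sketch, not the paper's formal definition, which is not reproduced on this page. It assumes a schema is an undirected graph of tables or columns, and it treats the radius of a task as the smallest k for which identical k-hop neighborhoods always carry identical labels. The names hop_distances, signature, and toy_locality_radius are hypothetical and introduced only for illustration.

```python
from collections import deque


def hop_distances(adj, start, k):
    """BFS distances from `start`, truncated at k hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand beyond k hops
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def signature(adj, feats, start, k):
    """Order-independent summary of the k-hop neighborhood.

    Assumption: a neighborhood is summarized by its sorted
    (distance, feature) pairs, which ignores edge structure.
    """
    dists = hop_distances(adj, start, k)
    return tuple(sorted((d, feats[n]) for n, d in dists.items()))


def toy_locality_radius(instances, max_k=4):
    """Smallest k such that equal k-hop signatures imply equal labels.

    `instances` is a list of (adj, feats, start, label) tuples.
    Returns None if no k <= max_k separates the labels.
    """
    for k in range(max_k + 1):
        seen = {}
        consistent = True
        for adj, feats, start, label in instances:
            sig = signature(adj, feats, start, k)
            if seen.setdefault(sig, label) != label:
                consistent = False
                break
        if consistent:
            return k
    return None


# Toy schema graph a - b - c; the label of `a` depends on the feature of `c`.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
instances = [
    (adj, {"a": "t", "b": "t", "c": "pk"}, "a", 1),
    (adj, {"a": "t", "b": "t", "c": "col"}, "a", 0),
]
print(toy_locality_radius(instances))  # prints 2
```

In this toy setup the label is only determined once nodes two hops away are visible, so an aggregator that sees fewer than two hops cannot distinguish the two instances. This is one way to read the hypothesized link between task radius and aggregation depth.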
arXiv:2602.17092 (cs)
[Submitted on 19 Feb 2026]
Title: A Locality Radius Framework for Understanding Relational Inductive Bias in Database Learning
Authors: Aadi Joshi, Kavya Bhand
Abstract: Foreign key discovery and related schema-level prediction tasks are often modeled using graph neural networks (GNNs), implicitly assuming that relational inductive bias improves performance. However, it remains unclear when multi-hop structural reasoning is actually necessary. In this work, we introduce locality radius, a formal measure of the minimum structural neighborhood required to determine a prediction in relational schemas. We hypothesize that model performance depends critically on alignment between task locality radius and architectural aggregation depth. We conduct a controlled empirical study across foreign key prediction, join cost estimation, blast radius regression, cascade impact classification, and additional graph-derived schema tasks. Our evaluation includes multi-seed experiments, capacity-matched comparisons, statistical significance testing, scaling analysis, and synthetic radius-controlled benchmarks. Results reveal a consistent bias-radius alignment effect.
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2602.17092 [cs.LG] (or arXiv:2602.17092v1 [cs.LG] for this v...