[2602.17330] SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework
Summary
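The paper presents SubQuad, an end-to-end pipeline for analyzing adaptive immune repertoires. It targets two bottlenecks, the near-quadratic cost of pairwise affinity evaluation and dataset imbalance that obscures minority clonotypes, by combining MinHash prefiltering, GPU-accelerated affinity kernels, learned multimodal fusion, and fairness-constrained clustering.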
Why It Matters
SubQuad targets both the efficiency and the fairness of immune repertoire analysis, which matter for vaccine development and biomarker discovery. By cutting the number of pairwise comparisons and enforcing proportional representation of minority clonotypes, it aims to make population-scale analysis more practical for clinical applications in immunology.
Key Takeaways
- SubQuad reduces pairwise affinity evaluation costs from near-quadratic to near-subquadratic.
- It employs a fairness-constrained clustering method to ensure representation of rare antigen-specific groups.
- The system integrates GPU-accelerated affinity kernels for improved performance.
- On large viral and tumor repertoires, SubQuad improves throughput and peak memory usage while preserving or improving recall@k and cluster purity.
- The framework is applicable to downstream tasks such as vaccine target prioritization and biomarker discovery.
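As a rough illustration of how a prefilter can avoid scoring every pair, the sketch below builds MinHash signatures over k-mer shingles and uses standard LSH banding to propose candidate pairs. It is not the authors' implementation: the function names (`kmers`, `minhash_signature`, `candidate_pairs`), the shingle size, the number of hashes, and the banding scheme are all assumptions chosen for the example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def kmers(seq, k=3):
    """Return the set of overlapping k-mers of a sequence (hypothetical shingling)."""
    return {seq[i:i + k] for i in range(max(len(seq) - k + 1, 1))}


def _hash(token, seed):
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(seq, num_hashes=32, k=3):
    """MinHash signature: for each seed, the minimum hash over all shingles."""
    shingles = kmers(seq, k)
    return tuple(min(_hash(s, seed) for s in shingles) for seed in range(num_hashes))


def candidate_pairs(seqs, num_hashes=32, bands=16, k=3):
    """Propose index pairs (i, j), i < j, that share at least one LSH band.

    Only these pairs would be passed on to the expensive affinity kernel;
    the band count is an illustrative assumption, not a value from the paper.
    """
    rows = num_hashes // bands
    buckets = defaultdict(list)
    for idx, seq in enumerate(seqs):
        sig = minhash_signature(seq, num_hashes, k)
        for b in range(bands):
            key = (b, sig[b * rows:(b + 1) * rows])
            buckets[key].append(idx)
    pairs = set()
    for members in buckets.values():
        # Members are appended in increasing index order, so i < j holds.
        pairs.update(combinations(members, 2))
    return pairs
```

Calling `candidate_pairs` on a list of receptor sequences returns a set of index pairs. Sequences with identical shingle sets always land in the same buckets, while dissimilar sequences are unlikely to collide, which is what keeps the number of full comparisons well below all-pairs.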
Computer Science > Machine Learning
arXiv:2602.17330 (cs)
[Submitted on 19 Feb 2026]
Title: SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework
Authors: Rong Fu, Zijian Zhang, Wenxin Zhang, Kun Liu, Jiekai Wu, Xianda Li, Simon Fong
Abstract: Comparative analysis of adaptive immune repertoires at population scale is hampered by two practical bottlenecks: the near-quadratic cost of pairwise affinity evaluations and dataset imbalances that obscure clinically important minority clonotypes. We introduce SubQuad, an end-to-end pipeline that addresses these challenges by combining antigen-aware, near-subquadratic retrieval with GPU-accelerated affinity kernels, learned multimodal fusion, and fairness-constrained clustering. The system employs compact MinHash prefiltering to sharply reduce candidate comparisons, a differentiable gating module that adaptively weights complementary alignment and embedding channels on a per-pair basis, and an automated calibration routine that enforces proportional representation of rare antigen-specific subgroups. On large viral and tumor repertoires SubQuad achieves measured gains in throughput and peak memory usage while preserving or improving recall@k, cluster purity, and subgr...
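
The per-pair gating described in the abstract can be pictured with a minimal sketch like the one below, which blends an alignment score and an embedding score through a sigmoid gate. This is only an assumed form: the abstract does not specify the gate's inputs or parameterization, so `gated_affinity`, the `features` vector, and the `weights` and `bias` parameters are hypothetical, and in a real system the parameters would be learned by gradient descent rather than set by hand.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_affinity(align_score, embed_score, features, weights, bias=0.0):
    """Blend two affinity channels with a per-pair gate in (0, 1).

    gate = sigmoid(weights . features + bias)   # assumed gate form
    fused = gate * align_score + (1 - gate) * embed_score
    """
    z = sum(w * f for w, f in zip(weights, features)) + bias
    gate = sigmoid(z)
    return gate * align_score + (1.0 - gate) * embed_score
```

Because the gate is a smooth function of its parameters, the fused score stays differentiable, and it always lies between the two channel scores. With all-zero weights and bias the two channels are averaged equally.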