[2602.22387] Disentangling Shared and Target-Enriched Topics via Background-Contrastive Non-negative Matrix Factorization
Summary
This article introduces background contrastive Non-negative Matrix Factorization, a method that isolates condition-specific biological signals in high-dimensional data by suppressing confounding variation that a target dataset shares with a matched background dataset.
Why It Matters
Variation shared across conditions, whether from baseline biological structure or technical effects, can dominate a dataset and hide the condition-specific signal that standard dimensionality reduction is meant to recover. Existing background correction methods are either unscalable to high dimensions or hard to interpret. This method targets both gaps: it produces non-negative components that can be read directly at the feature level, and it is designed to scale.
Key Takeaways
- Introduces a new method for extracting target-enriched latent topics from high-dimensional biological data.
- Addresses limitations of existing background correction methods that are either unscalable or hard to interpret.
- Demonstrates effectiveness across various biological datasets, revealing previously obscured signals.
- Utilizes a contrastive objective to suppress background-expressed structures, enhancing interpretability.
- Scales to large datasets through efficient GPU-based training.
Computer Science > Machine Learning
arXiv:2602.22387 (cs)
[Submitted on 25 Feb 2026]
Title: Disentangling Shared and Target-Enriched Topics via Background-Contrastive Non-negative Matrix Factorization
Authors: Yixuan Li, Archer Y. Yang, Yue Li
Abstract: Biological signals of interest in high-dimensional data are often masked by dominant variation shared across conditions. This variation, arising from baseline biological structure or technical effects, can prevent standard dimensionality reduction methods from resolving condition-specific structure. The challenge is that these confounding topics are often unknown and mixed with biological signals. Existing background correction methods are either unscalable to high dimensions or not interpretable. We introduce background contrastive Non-negative Matrix Factorization, which extracts target-enriched latent topics by jointly factorizing a target dataset and a matched background using shared non-negative bases under a contrastive objective that suppresses background-expressed structure. This approach yields non-negative components that are directly interpretable at the feature level, and explicitly isolates target-specific variation. The method is learned by an efficient multiplicative update algorithm via matrix multiplicatio...
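To make the phrase "jointly factorizing a target dataset and a matched background using shared non-negative bases" concrete, here is a minimal NumPy sketch. It is not the authors' algorithm: the excerpt above does not give the form of the contrastive objective, so the sketch omits it and uses the standard Lee-Seung multiplicative updates for a plain squared-error loss on the stacked data. The function name `joint_nmf`, its parameters, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def joint_nmf(X_target, X_background, n_topics, n_iter=200, eps=1e-10, seed=0):
    """Factorize two non-negative matrices with one shared basis H.

    Assumption: plain Frobenius-loss NMF on the stacked data. The paper's
    contrastive term is NOT implemented here.

    X_target     : (n_t, d) non-negative array
    X_background : (n_b, d) non-negative array
    Returns W_target (n_t, k), W_background (n_b, k), H (k, d).
    """
    rng = np.random.default_rng(seed)
    # Stack samples row-wise; both datasets share the same d features.
    X = np.vstack([X_target, X_background]).astype(float)
    n, d = X.shape
    W = rng.random((n, n_topics)) + eps   # per-sample topic loadings
    H = rng.random((n_topics, d)) + eps   # shared non-negative bases

    for _ in range(n_iter):
        # Multiplicative updates: each step is a ratio of matrix products,
        # so entries that start non-negative stay non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)

    n_t = X_target.shape[0]
    return W[:n_t], W[n_t:], H


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X_t = rng.random((30, 12))   # hypothetical target data
    X_b = rng.random((20, 12))   # hypothetical matched background
    W_t, W_b, H = joint_nmf(X_t, X_b, n_topics=4)
    print(W_t.shape, W_b.shape, H.shape)
```

Because every update is a handful of dense matrix products, the same loop maps naturally onto GPU array libraries, which is consistent with the scalability claim in the summary. In this simplified version, nothing pushes a topic to be target-specific; that is the role the abstract assigns to the contrastive objective.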