[2602.19685] PerturbDiff: Functional Diffusion for Single-Cell Perturbation Modeling
Summary
PerturbDiff models single-cell responses to perturbations with a diffusion-based generative process that operates over probability distributions rather than over individual cells, improving prediction accuracy.
Why It Matters
Single-cell sequencing destroys the cell it measures, so the same cell can never be observed both before and after a perturbation, and response models must be learned from unpaired control and perturbed populations. Better predictive models of these responses are important for understanding complex biological systems and developing targeted therapies, and the methodology could contribute to personalized medicine and drug development.
Key Takeaways
- PerturbDiff models cellular responses by focusing on entire distributions rather than individual cells.
- The approach captures variability in responses caused by unobservable latent factors, such as microenvironmental fluctuations and batch effects, enhancing prediction accuracy.
- Benchmarks demonstrate that PerturbDiff achieves state-of-the-art performance in single-cell response prediction.
Computer Science > Machine Learning
arXiv:2602.19685 (cs)
[Submitted on 23 Feb 2026]
Title: PerturbDiff: Functional Diffusion for Single-Cell Perturbation Modeling
Authors: Xinyu Yuan, Xixian Liu, Ya Shi Zhang, Zuobai Zhang, Hongyu Guo, Jian Tang
Abstract: Building Virtual Cells that can accurately simulate cellular responses to perturbations is a long-standing goal in systems biology. A fundamental challenge is that high-throughput single-cell sequencing is destructive: the same cell cannot be observed both before and after a perturbation. Thus, perturbation prediction requires mapping unpaired control and perturbed populations. Existing models address this by learning maps between distributions, but typically assume a single fixed response distribution when conditioned on observed cellular context (e.g., cell type) and the perturbation type. In reality, responses vary systematically due to unobservable latent factors such as microenvironmental fluctuations and complex batch effects, forming a manifold of possible distributions for the same observed conditions. To account for this variability, we introduce PerturbDiff, which shifts modeling from individual cells to entire distributions. By embedding distributions as points in a Hilbert space, we define a diffusion-based generative process operating directly ove...
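The abstract describes embedding distributions as points in a Hilbert space. One standard construction with that property is the kernel mean embedding, under which the distance between two embedded populations is the maximum mean discrepancy (MMD). The sketch below illustrates that idea only. The excerpt does not say which embedding PerturbDiff uses, and the function names, the RBF kernel, the bandwidth, and the synthetic Gaussian "cells" are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(x, y, bandwidth=2.0):
    """Gaussian RBF kernel matrix between the rows of x and the rows of y."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=2.0):
    """Biased estimate of squared MMD between two unpaired samples.

    This equals the squared Hilbert-space distance between the empirical
    kernel mean embeddings of x and y, so each population is treated as
    a single point rather than as a set of individually matched cells.
    """
    return (
        rbf_kernel(x, x, bandwidth).mean()
        + rbf_kernel(y, y, bandwidth).mean()
        - 2.0 * rbf_kernel(x, y, bandwidth).mean()
    )


# Hypothetical toy data: 200 "cells" with 5 "genes" per population.
rng = np.random.default_rng(0)
control = rng.normal(loc=0.0, scale=1.0, size=(200, 5))
# Same nominal perturbation, but two batches with different latent shifts.
perturbed_batch_a = rng.normal(loc=1.0, scale=1.0, size=(200, 5))
perturbed_batch_b = rng.normal(loc=1.5, scale=1.0, size=(200, 5))

print("control vs batch A:", mmd_squared(control, perturbed_batch_a))
print("batch A vs batch B:", mmd_squared(perturbed_batch_a, perturbed_batch_b))
```

No cell in one sample is paired with a cell in the other, which matches the unpaired setting the abstract describes. The two perturbed batches share the same observed condition yet land at different points in the embedding space, a toy version of the "manifold of possible distributions" that a single fixed response distribution cannot represent.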