[2602.15984] Verifier-Constrained Flow Expansion for Discovery Beyond the Data
Summary
This paper presents Flow Expander (FE), a method that adapts a pre-trained flow model for scientific discovery so that it generates samples beyond the available data distribution, while a verifier keeps those samples valid.
Why It Matters
Generating valid samples beyond existing data is crucial for scientific discovery, particularly in fields like molecular design. Flow and diffusion models pre-trained on limited data cover only a fraction of the valid design space, and this research targets that limitation directly.
Key Takeaways
- Introduces Flow Expander (FE), a scalable mirror descent scheme for verifier-constrained flow expansion.
- FE allows for sample generation beyond the limitations of existing data.
- The method includes theoretical analysis and convergence guarantees.
- Empirical evaluations demonstrate increased diversity in generated samples.
- Addresses critical challenges in scientific discovery applications.
Computer Science > Machine Learning
arXiv:2602.15984 (cs)
[Submitted on 17 Feb 2026]
Title: Verifier-Constrained Flow Expansion for Discovery Beyond the Data
Authors: Riccardo De Santi, Kimon Protopapas, Ya-Ping Hsieh, Andreas Krause
Abstract: Flow and diffusion models are typically pre-trained on limited available data (e.g., molecular samples), covering only a fraction of the valid design space (e.g., the full molecular space). As a consequence, they tend to generate samples from only a narrow portion of the feasible domain. This is a fundamental limitation for scientific discovery applications, where one typically aims to sample valid designs beyond the available data distribution. To this end, we address the challenge of leveraging access to a verifier (e.g., an atomic bonds checker) to adapt a pre-trained flow model so that its induced density expands beyond regions of high data availability, while preserving sample validity. We introduce formal notions of strong and weak verifiers and propose algorithmic frameworks for global and local flow expansion via probability-space optimization. Then, we present Flow Expander (FE), a scalable mirror descent scheme that provably tackles both problems by verifier-constrained entropy maximization over the flow process noised state space. Next, we provide a thorough ...
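To make the phrase "verifier-constrained entropy maximization" via mirror descent more concrete, here is a minimal toy sketch on a small discrete space. It is not the paper's Flow Expander algorithm, which operates on the noised state space of a flow model. The function name `expand_on_valid_set`, the five-state distribution `p0`, the boolean `valid` mask standing in for a verifier, and the step size are all assumptions made for illustration. The sketch uses the standard entropic mirror ascent update: with the KL divergence as the Bregman divergence and gradient of the entropy equal to -(log p + 1), each step becomes p_new ∝ p^(1 - step), restricted to the states the verifier accepts.

```python
import numpy as np


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution, ignoring zeros."""
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())


def expand_on_valid_set(p0, valid, step=0.5, iters=50):
    """Toy entropic mirror ascent on entropy, restricted to verifier-valid states.

    Assumptions (hypothetical, for illustration only):
      - p0 is a discrete "pre-trained" distribution, strictly positive on
        every valid state;
      - valid is a boolean mask playing the role of a verifier;
      - 0 < step < 1.
    """
    valid = np.asarray(valid, dtype=bool)
    # Enforce the constraint: drop all mass on invalid states and renormalize.
    p = np.where(valid, p0, 0.0)
    p = p / p.sum()
    for _ in range(iters):
        log_p = np.log(p[valid])
        # Mirror ascent step: log p <- log p + step * grad H(p),
        # where grad H(p) = -(log p + 1); constants vanish on normalization.
        logits = (1.0 - step) * log_p
        w = np.exp(logits - logits.max())  # subtract max for numerical stability
        p = np.zeros_like(p)
        p[valid] = w / w.sum()
    return p


# Hypothetical pre-trained distribution, concentrated on the first state.
p0 = np.array([0.70, 0.20, 0.05, 0.03, 0.02])
# Hypothetical verifier: state 2 is invalid, all others are valid.
valid = np.array([True, True, False, True, True])

p_masked = np.where(valid, p0, 0.0) / p0[valid].sum()
p = expand_on_valid_set(p0, valid)

print("entropy before expansion:", entropy(p_masked))
print("entropy after expansion: ", entropy(p))
print("mass on invalid state:   ", p[2])
```

In this toy setting the iterates spread mass evenly over the valid states while the invalid state keeps zero probability, which mirrors the stated goal of expanding the density beyond high-data regions without sacrificing validity. The real method has to achieve this for a continuous flow model, where the density is not available in closed form.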