[2602.16944] Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming
Summary
This paper presents a framework that exactly certifies neural-network training against data-poisoning attacks using mixed-integer programming: solving a single optimization problem both recovers a worst-case poisoning attack and bounds the effect of every attack in the threat model.
Why It Matters
As machine learning systems become increasingly prevalent, understanding and mitigating data-poisoning attacks is crucial for maintaining model integrity and security. This research offers a novel approach to certifying robustness, which can enhance trust in AI applications.
Key Takeaways
- Introduces a verification framework for data-poisoning attacks.
- Formulates data manipulation, training, and test-time evaluation as a single mixed-integer quadratic program (MIQCP) whose global optimum is a provably worst-case attack.
- Provides sound and complete guarantees for training-time robustness.
- Experimental results confirm the framework's effectiveness on small models.
- Addresses a critical area in AI safety and model evaluation.
Computer Science > Machine Learning
arXiv:2602.16944 (cs) [Submitted on 18 Feb 2026]
Title: Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming
Authors: Philip Sosnin, Jodie Knapp, Fraser Kennedy, Josh Collyer, Calvin Tsay
Abstract: This work introduces a verification framework that provides both sound and complete guarantees for data poisoning attacks during neural network training. We formulate adversarial data manipulation, model training, and test-time evaluation in a single mixed-integer quadratic programming (MIQCP) problem. Finding the global optimum of the proposed formulation provably yields worst-case poisoning attacks, while simultaneously bounding the effectiveness of all possible attacks on the given training pipeline. Our framework encodes both the gradient-based training dynamics and model evaluation at test time, enabling the first exact certification of training-time robustness. Experimental evaluation on small models confirms that our approach delivers a complete characterization of robustness against data poisoning.
Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2602.16944 [cs.LG] (arXiv:2602.16944v1 for this version), https://doi.org/10.48550/arXiv.2602.16944
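The single optimization problem described in the abstract can be written schematically as a maximization over poisoned datasets subject to the unrolled training dynamics (the notation below is illustrative, not the paper's):

```latex
\max_{\tilde{D} \in \mathcal{A}(D)} \; L_{\mathrm{test}}\bigl(\theta_T\bigr)
\quad \text{s.t.} \quad
\theta_{t+1} = \theta_t - \eta \, \nabla_\theta L\bigl(\theta_t; \tilde{D}\bigr),
\qquad t = 0, \dots, T-1
```

Here $\mathcal{A}(D)$ is the set of datasets the attacker can reach from the clean data $D$, the constraints encode $T$ gradient-descent training steps, and $L_{\mathrm{test}}$ is the test-time evaluation. A globally optimal solution is a worst-case attack, and its objective value upper-bounds every attack in $\mathcal{A}(D)$, which is the sound-and-complete guarantee claimed in the abstract.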