[2509.21725] Information-Theoretic Bayesian Optimization for Bilevel Optimization Problems
Summary
This paper presents an information-theoretic approach to Bayesian optimization (BO) for bilevel optimization problems in which both the upper- and lower-level objectives are expensive black-box functions.
Why It Matters
Bilevel optimization arises in fields such as machine learning and operations research, where decisions at one level depend on outcomes from another. Because of its nested structure, bilevel BO has been studied less than other BO extensions such as multi-objective or constrained optimization. This paper proposes a unified criterion that accounts for both levels at once.
Key Takeaways
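- Introduces an information-theoretic framework for bilevel BO based on the information gain about the upper- and lower-level optimal solutions and values.
- Defines a unified criterion that measures the benefit for both levels simultaneously.
- Presents a practical lower-bound-based approach to evaluating the information gain.
- Reports empirical results supporting the effectiveness of the proposed method.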
Computer Science > Machine Learning
arXiv:2509.21725 (cs)
[Submitted on 26 Sep 2025 (v1), last revised 26 Feb 2026 (this version, v2)]
Title: Information-Theoretic Bayesian Optimization for Bilevel Optimization Problems
Authors: Takuya Kanayama, Yuki Ito, Tomoyuki Tamura, Masayuki Karasuyama
Abstract: A bilevel optimization problem consists of two optimization problems nested as an upper- and a lower-level problem, in which the optimality of the lower-level problem defines a constraint for the upper-level problem. This paper considers Bayesian optimization (BO) for the case that both the upper- and lower-levels involve expensive black-box functions. Because of its nested structure, bilevel optimization has a complex problem definition, by which bilevel BO has not been widely studied compared with other standard extensions of BO such as multi-objective or constraint problems. We propose an information-theoretic approach that considers the information gain of both the upper- and lower-optimal solutions and values. This enables us to define a unified criterion that measures the benefit for both level problems, simultaneously. Further, we also show a practical lower bound based approach to evaluating the information gain. We empirically demonstrate the effectiveness of our proposed method through several ...
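The nested structure described in the abstract can be made concrete with a toy problem. The sketch below is not the paper's method: it solves a small bilevel problem by brute-force nested grid search, which only works because the two objectives here are cheap to evaluate. The functions `upper_objective` and `lower_objective`, the helper names, and the grid ranges are all hypothetical choices made for illustration.

```python
# Toy bilevel problem (illustrative assumption, not taken from the paper):
#   upper level:  min_x  F(x, z*(x))
#   lower level:  z*(x) = argmin_z f(x, z)

def upper_objective(x, z):
    # Hypothetical upper-level objective F(x, z).
    return (x - 1.0) ** 2 + (z - 0.5) ** 2

def lower_objective(x, z):
    # Hypothetical lower-level objective f(x, z); its minimizer over z is z = x.
    return (z - x) ** 2

def make_grid(lo, hi, n):
    # Evenly spaced grid of n points on [lo, hi].
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def lower_solution(x, z_grid):
    # Solve the lower-level problem for a fixed upper-level decision x.
    return min(z_grid, key=lambda z: lower_objective(x, z))

def solve_bilevel(x_grid, z_grid):
    # For each x, the lower-level optimum z*(x) acts as a constraint:
    # the upper level is only evaluated at pairs (x, z*(x)).
    best = None
    for x in x_grid:
        z_star = lower_solution(x, z_grid)
        value = upper_objective(x, z_star)
        if best is None or value < best[2]:
            best = (x, z_star, value)
    return best

x_grid = make_grid(0.0, 2.0, 201)
z_grid = make_grid(0.0, 2.0, 201)
x_opt, z_opt, f_opt = solve_bilevel(x_grid, z_grid)
print(x_opt, z_opt, f_opt)
```

For this toy problem the lower-level solution is z*(x) = x, so the upper level reduces to minimizing (x - 1)^2 + (x - 0.5)^2, whose minimizer is x = 0.75 with value 0.125. The grid search spends 201 × 201 lower-level evaluations to find it. That cost is what makes the setting in the abstract hard: when every evaluation of either objective is expensive, exhaustive nesting is not feasible and queries have to be chosen carefully.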