[2602.02137] DCoPilot: Generative AI-Empowered Policy Adaptation for Dynamic Data Center Operations
Summary
DCoPilot is a hybrid framework that uses generative AI to adapt control policies for dynamic data center operation, aiming for safe and energy-efficient management of AI workloads.
Why It Matters
As data centers increasingly host AI-driven workloads, the need for rapid adaptation to changing conditions becomes critical. DCoPilot addresses the challenges of manual policy design, providing a solution that enhances operational efficiency and minimizes service disruptions. This innovation is significant for organizations relying on data centers to maintain service-level agreements and optimize energy usage.
Key Takeaways
- DCoPilot pairs a large language model, which generates structured reward forms, with a hypernetwork, which generates policy weights.
- The framework operates in three coordinated phases: simulation scale-up, meta policy distillation, and online adaptation.
- DCoPilot achieves near-zero constraint violations across diverse control tasks.
- Ablation studies confirm the effectiveness of LLM-based reward generation.
- This approach significantly outperforms traditional methods in dynamic environments.
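To make the hypernetwork idea in the takeaways concrete, here is a minimal sketch of parametric policy generation: one network maps a task specification vector to the full weight vector of a small policy network, so a new specification yields a new policy without retraining the policy itself. This is not the paper's architecture. The layer sizes, the three-element specification vector, and every function name below are hypothetical, and the hypernetwork is left untrained to keep the example short.

```python
import math
import random

# Illustrative sizes only; none of these come from the paper.
OBS_DIM, ACT_DIM, HIDDEN, SPEC_DIM = 4, 2, 8, 3

# Parameter count of the target policy: one hidden layer plus an output layer.
N_POLICY_PARAMS = OBS_DIM * HIDDEN + HIDDEN + HIDDEN * ACT_DIM + ACT_DIM


def linear(x, W, b):
    """Return x @ W + b, where W has len(x) rows and len(b) columns."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def init_hypernetwork(seed=0, hyper_hidden=16):
    """Randomly initialise a two-layer hypernetwork (untrained, for illustration)."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": mat(SPEC_DIM, hyper_hidden), "b1": [0.0] * hyper_hidden,
        "W2": mat(hyper_hidden, N_POLICY_PARAMS), "b2": [0.0] * N_POLICY_PARAMS,
    }


def generate_policy_weights(hyper, spec):
    """Map a task specification vector to the flat parameter vector of a policy."""
    h = [math.tanh(v) for v in linear(spec, hyper["W1"], hyper["b1"])]
    return linear(h, hyper["W2"], hyper["b2"])


def unpack(flat):
    """Split the flat vector into the policy's weight matrices and bias vectors."""
    shapes = [(OBS_DIM, HIDDEN), (1, HIDDEN), (HIDDEN, ACT_DIM), (1, ACT_DIM)]
    blocks, i = [], 0
    for rows, cols in shapes:
        chunk = flat[i:i + rows * cols]
        blocks.append([chunk[r * cols:(r + 1) * cols] for r in range(rows)])
        i += rows * cols
    W1, b1, W2, b2 = blocks
    return W1, b1[0], W2, b2[0]


def act(flat, obs):
    """Run the generated policy on one observation; tanh bounds actions to [-1, 1]."""
    W1, b1, W2, b2 = unpack(flat)
    h = [math.tanh(v) for v in linear(obs, W1, b1)]
    return [math.tanh(v) for v in linear(h, W2, b2)]


hyper = init_hypernetwork(seed=0)
spec_a = [0.2, 1.0, 0.5]  # hypothetical specification, e.g. normalised limits
spec_b = [0.9, 0.3, 0.1]  # a second, different specification
obs = [0.1, -0.2, 0.3, 0.0]

theta_a = generate_policy_weights(hyper, spec_a)
theta_b = generate_policy_weights(hyper, spec_b)
action_a = act(theta_a, obs)
action_b = act(theta_b, obs)
```

Each specification yields its own parameter vector, so `theta_a` and `theta_b` define two distinct policies produced by the same hypernetwork. In a trained system the hypernetwork's weights would be learned so that the generated policies perform well under their specifications; that training step is omitted here.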
Computer Science > Machine Learning
arXiv:2602.02137 (cs)
[Submitted on 2 Feb 2026 (v1), last revised 25 Feb 2026 (this version, v3)]
Title: DCoPilot: Generative AI-Empowered Policy Adaptation for Dynamic Data Center Operations
Authors: Minghao Li, Ruihang Wang, Rui Tan, Yonggang Wen
Abstract: Modern data centers (DCs) hosting artificial intelligence (AI)-dedicated devices operate at high power densities with rapidly varying workloads, making minute-level adaptation essential for safe and energy-efficient operation. However, manually designing piecewise deep reinforcement learning (DRL) agents cannot keep pace with frequent dynamics shifts and service-level agreement (SLA) changes of an evolving DC. This specification-to-policy lag causes a lack of timely, effective control policies, which may lead to service outages. To bridge the gap, we present DCoPilot, a hybrid framework for generative control policies in dynamic DC operation. DCoPilot synergizes two distinct generative paradigms, i.e., a large language model (LLM) that performs symbolic generation of structured reward forms, and a hypernetwork that conducts parametric generation of policy weights. DCoPilot operates through three coordinated phases: (i) simulation scale-up, which stress-tests reward candidates across diverse simulation-ready ...