[2602.15350] Fine-Tuning LLMs to Generate Economical and Reliable Actions for the Power Grid
Summary
This paper presents a multi-stage pipeline for fine-tuning large language models (LLMs) to generate economical and reliable corrective switching actions for power grid management during Public Safety Power Shutoffs (PSPS).
Why It Matters
The research addresses critical challenges in power grid management, particularly during emergencies that require rapid adjustments. By improving the reliability and efficiency of corrective actions through LLMs, this work could enhance grid stability and safety, which is vital for public welfare and energy management.
Key Takeaways
- Fine-tuning LLMs can significantly improve corrective actions in power grid scenarios.
- The proposed method reduces AC power-flow failures from 50% to single digits.
- Direct preference optimization on AC-evaluated preference pairs injects voltage-awareness beyond DC imitation.
- A reproducible framework is provided, enhancing the study's credibility.
- The research highlights the importance of AI in managing critical infrastructure.
Electrical Engineering and Systems Science > Systems and Control
arXiv:2602.15350 (eess) [Submitted on 17 Feb 2026]
Title: Fine-Tuning LLMs to Generate Economical and Reliable Actions for the Power Grid
Authors: Mohamad Chehade, Hao Zhu
Abstract: Public Safety Power Shutoffs (PSPS) force rapid topology changes that can render standard operating points infeasible, requiring operators to quickly identify corrective transmission switching actions that reduce load shedding while maintaining acceptable voltage behavior. We present a verifiable, multi-stage adaptation pipeline that fine-tunes an instruction-tuned large language model (LLM) to generate *open-only* corrective switching plans from compact PSPS scenario summaries under an explicit switching budget. First, supervised fine-tuning distills a DC-OPF MILP oracle into a constrained action grammar that enables reliable parsing and feasibility checks. Second, direct preference optimization refines the policy using AC-evaluated preference pairs ranked by a voltage-penalty metric, injecting voltage-awareness beyond DC imitation. Finally, best-of-N selection provides an inference-time addition by choosing the best feasible candidate under the target metric. On IEEE 118-bus PSPS scenarios, fine-tuning substantially improves DC objective values versus zero...
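The constrained action grammar and best-of-N selection described in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical reconstruction, not the authors' implementation: the `OPEN a-b` plan syntax, the `parse_plan` and `best_of_n` helpers, and the scoring function are all illustrative assumptions standing in for the paper's grammar and its AC-evaluated voltage-penalty metric.

```python
import re

def parse_plan(text):
    """Parse an open-only switching plan under a toy grammar of the form
    'OPEN 12-15; OPEN 30-38'. Returns a list of (from_bus, to_bus) lines
    to open, or None if the text does not match the grammar."""
    actions = []
    for part in text.split("; "):
        m = re.fullmatch(r"OPEN (\d+)-(\d+)", part)
        if m is None:
            return None  # e.g. a CLOSE action or free-form text: reject
        actions.append((int(m.group(1)), int(m.group(2))))
    return actions

def best_of_n(candidates, score_fn, budget=3):
    """Inference-time best-of-N selection: keep only candidates that parse
    under the grammar and respect the switching budget, then return the
    feasible plan with the lowest score (lower = better)."""
    best, best_score = None, float("inf")
    for text in candidates:
        plan = parse_plan(text)
        if plan is None or len(plan) > budget:
            continue  # unparsable or over the switching budget
        s = score_fn(plan)
        if s < best_score:
            best, best_score = plan, s
    return best

# Toy usage: the score here is just the number of opened lines, a stand-in
# for the AC-evaluated voltage-penalty metric used in the paper.
plans = ["OPEN 12-15; OPEN 30-38", "CLOSE 5-6", "OPEN 1-2"]
print(best_of_n(plans, score_fn=len))  # [(1, 2)]
```

In this sketch the grammar check and budget check play the role of the paper's "reliable parsing and feasibility checks": malformed candidates are discarded before any scoring, so the selector only ever compares valid open-only plans.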