[2602.22760] Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study
Summary
This study explores the feasibility of pretraining large language models (LLMs) during renewable energy curtailment periods, aiming to reduce operational emissions and utilize excess clean energy.
Why It Matters
Training LLMs requires substantial compute and energy, while renewable sources regularly produce more electricity than the grid can absorb. Aligning training with these curtailment windows lets pretraining run on electricity that is both clean and cheap and that would otherwise go to waste, which lowers the operational emissions of AI development.
Key Takeaways
- Training LLMs during renewable energy curtailment can significantly lower operational emissions.
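- The proposed system performs full-parameter training across geo-distributed GPU clusters, elastically switching between local single-site training and federated multi-site synchronization as sites enter or leave their regional curtailment windows.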
- Preliminary results indicate that curtailment-aware scheduling maintains training quality while reducing operational emissions to 5-12% of those of conventional training.
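The elastic switching described in the takeaways can be illustrated with a minimal sketch. It assumes a simple rule that is not taken from the paper: no site in a curtailment window means training pauses, exactly one site means local training, and two or more sites mean federated synchronization. The names `Mode` and `select_mode` and the site labels are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    IDLE = "idle"            # no site is inside a curtailment window
    LOCAL = "local"          # one site trains on its own
    FEDERATED = "federated"  # several sites train and synchronize


def select_mode(site_in_window: dict[str, bool]) -> tuple[Mode, list[str]]:
    """Pick a training mode from per-site curtailment status.

    Assumed rule for illustration only; the paper's actual
    scheduling policy may differ.
    """
    active = sorted(site for site, ok in site_in_window.items() if ok)
    if not active:
        return Mode.IDLE, active
    if len(active) == 1:
        return Mode.LOCAL, active
    return Mode.FEDERATED, active
```

Calling `select_mode({"site_a": True, "site_b": False, "site_c": True})` returns `(Mode.FEDERATED, ["site_a", "site_c"])`. Re-evaluating the rule whenever a window opens or closes gives the elastic behavior in its simplest form.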
arXiv:2602.22760 (cs), Distributed, Parallel, and Cluster Computing
Submitted on 26 Feb 2026
Authors: Philipp Wiesner, Soeren Becker, Brett Cornick, Dominik Scheinert, Alexander Acker, Odej Kao
Abstract
Training large language models (LLMs) requires substantial compute and energy. At the same time, renewable energy sources regularly produce more electricity than the grid can absorb, leading to curtailment, the deliberate reduction of clean generation that would otherwise go to waste. These periods represent an opportunity: if training is aligned with curtailment windows, LLMs can be pretrained using electricity that is both clean and cheap. This technical report presents a system that performs full-parameter LLM training across geo-distributed GPU clusters during regional curtailment windows, elastically switching between local single-site training and federated multi-site synchronization as sites become available or unavailable. Our prototype trains a 561M-parameter transformer model across three clusters using the Flower federated learning framework, with curtailment periods derived from real-world marginal carbon intensity traces. Preliminary results show that curtailment-aware ...
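The abstract states that curtailment periods are derived from marginal carbon intensity traces but does not say how. One plausible approach, sketched below, treats any run of consecutive samples at or below a threshold as a curtailment window. The threshold rule, the evenly spaced samples, and the function name `curtailment_windows` are assumptions made for illustration, not the authors' method.

```python
def curtailment_windows(intensity: list[float], threshold: float = 0.0) -> list[tuple[int, int]]:
    """Return half-open (start, end) index ranges where the trace is at or below threshold.

    `intensity` is assumed to hold evenly spaced marginal carbon
    intensity samples (for example in gCO2/kWh). The thresholding
    rule is an illustrative assumption.
    """
    windows = []
    start = None
    for i, value in enumerate(intensity):
        if value <= threshold and start is None:
            start = i                      # a window opens at this sample
        elif value > threshold and start is not None:
            windows.append((start, i))     # the window closed before sample i
            start = None
    if start is not None:
        windows.append((start, len(intensity)))  # window runs to the end of the trace
    return windows
```

For the hypothetical trace `[400.0, 0.0, 0.0, 350.0, 0.0]` with the default threshold, the function returns `[(1, 3), (4, 5)]`. Running it per region gives the per-site availability that a scheduler could use to decide which clusters train at a given time.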