[2602.15830] Ensemble-size-dependence of deep-learning post-processing methods that minimize an (un)fair score: motivating examples and a proof-of-concept solution
Summary
This paper shows that deep-learning post-processing methods trained to minimize a nominally fair score can still exhibit a dependence on ensemble size, and presents a proof-of-concept solution based on trajectory transformers.
Why It Matters
The study addresses a critical challenge in ensemble forecasting: post-processing methods that introduce dependence between ensemble members can bias a nominally fair score with respect to the size of the training ensemble. By proposing an approach that preserves conditional independence between members, it enhances the reliability of forecasts, which is vital for applications in atmospheric science and beyond.
Key Takeaways
- Ensemble forecasts can be biased by the size of the training ensemble.
- Traditional post-processing methods may violate the assumption that members are conditionally independent draws, rendering a nominally fair score unfair.
- Trajectory transformers can achieve ensemble-size independence while improving forecast reliability.
Physics > Atmospheric and Oceanic Physics · arXiv:2602.15830 (physics)
Submitted on 17 Feb 2026
Authors: Christopher David Roberts
Abstract: Fair scores reward ensemble forecast members that behave like samples from the same distribution as the verifying observations. They are therefore an attractive choice as loss functions to train data-driven ensemble forecasts or post-processing methods when large training ensembles are either unavailable or computationally prohibitive. The adjusted continuous ranked probability score (aCRPS) is fair and unbiased with respect to ensemble size, provided forecast members are exchangeable and interpretable as conditionally independent draws from an underlying predictive distribution. However, distribution-aware post-processing methods that introduce structural dependency between members can violate this assumption, rendering aCRPS unfair. We demonstrate this effect using two approaches designed to minimize the expected aCRPS of a finite ensemble: (1) a linear member-by-member calibration, which couples members through a common dependency on the sampl...
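To make the fairness adjustment the abstract describes concrete, the sketch below compares the plain ensemble CRPS estimator with the standard fair (ensemble-size-adjusted) estimator, which divides the member-spread term by M(M-1) instead of M². This is a minimal illustration under the assumption of conditionally independent, exchangeable members (the setting in which the abstract says aCRPS is unbiased); the function names `crps_naive` and `crps_fair` are illustrative, not from the paper.

```python
import numpy as np

def crps_naive(ens, y):
    # Plain ensemble CRPS estimator: biased low on spread for finite M.
    M = len(ens)
    skill = np.abs(ens - y).mean()
    spread = np.abs(ens[:, None] - ens[None, :]).sum() / (2 * M**2)
    return skill - spread

def crps_fair(ens, y):
    # Fair estimator: spread term divided by M*(M-1), unbiased in expectation
    # when members are conditionally independent and exchangeable.
    M = len(ens)
    skill = np.abs(ens - y).mean()
    spread = np.abs(ens[:, None] - ens[None, :]).sum() / (2 * M * (M - 1))
    return skill - spread

# Average scores over many synthetic forecast cases, varying ensemble size M.
# Members and observations are independent N(0, 1) draws, so the forecast is
# perfectly reliable and the expected fair score should not depend on M.
rng = np.random.default_rng(0)
results = {}
for M in (2, 5, 50):
    naive, fair = [], []
    for _ in range(10000):
        y = rng.normal()             # verifying observation
        ens = rng.normal(size=M)     # M conditionally independent members
        naive.append(crps_naive(ens, y))
        fair.append(crps_fair(ens, y))
    results[M] = (np.mean(naive), np.mean(fair))
    # Fair mean stays roughly constant across M; naive mean shrinks as M grows.
    print(f"M={M}: naive={results[M][0]:.3f}, fair={results[M][1]:.3f}")
```

Note that this unbiasedness is exactly what breaks down once a post-processing step couples the members: the pairwise spread term then no longer estimates the spread of independent draws, which is the effect the paper demonstrates.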