[2602.17634] Reverso: Efficient Time Series Foundation Models for Zero-shot Forecasting
Summary
The paper presents Reverso, an efficient time series foundation model for zero-shot forecasting, demonstrating that smaller hybrid models can outperform larger transformers.
Why It Matters
As demand for efficient machine learning models grows, this research offers a meaningful advance in time series forecasting: it matches the accuracy of much larger transformer models with far smaller ones. Reducing model size while maintaining performance lowers cost and resource requirements, making zero-shot forecasting more practical across industries.
Key Takeaways
- Reverso models are significantly smaller than traditional transformer models.
- Hybrid architectures that interleave long convolution and linear RNN (DeltaNet) layers can match the performance of transformer-based models more than a hundred times larger.
- The research introduces effective data augmentation and inference strategies to enhance model efficiency.
- This approach can reduce operational costs and resource usage in time series forecasting.
- The findings push the performance-efficiency boundary in machine learning models.
Computer Science > Machine Learning
arXiv:2602.17634 (cs) [Submitted on 19 Feb 2026]
Title: Reverso: Efficient Time Series Foundation Models for Zero-shot Forecasting
Authors: Xinghong Fu, Yanhong Li, Georgios Papaioannou, Yoon Kim
Abstract: Learning time series foundation models has been shown to be a promising approach for zero-shot time series forecasting across diverse time series domains. Insofar as scaling has been a critical driver of the performance of foundation models in other modalities such as language and vision, much recent work on time series foundation modeling has focused on scaling. This has resulted in time series foundation models with hundreds of millions of parameters that, while performant, are inefficient and expensive to use in practice. This paper describes a simple recipe for learning efficient foundation models for zero-shot time series forecasting that are orders of magnitude smaller. We show that large-scale transformers are not necessary: small hybrid models that interleave long convolution and linear RNN layers (in particular DeltaNet layers) can match the performance of larger transformer-based models while being more than a hundred times smaller. We also describe several data augmentation and inference strategies that further improve performance. This recipe results in Reverso, ...
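To make the hybrid architecture concrete, the following is a minimal toy sketch, not the paper's actual model: it interleaves a causal long convolution with a DeltaNet-style linear recurrence (state update S <- S - beta * (S k) k^T + beta * v k^T, output o = S q). All function names, the scalar key/value projections, and the one-step "forecast" usage are illustrative assumptions made here for clarity.

```python
# Hypothetical sketch of a conv + DeltaNet hybrid block (pure Python,
# not the paper's implementation); dimensions kept tiny for readability.

def causal_conv(x, kernel):
    """Causal 1-D convolution: y[t] = sum_j kernel[j] * x[t - j]."""
    T, K = len(x), len(kernel)
    return [sum(kernel[j] * x[t - j] for j in range(K) if t - j >= 0)
            for t in range(T)]

def deltanet_step(S, k, v, q, beta):
    """One DeltaNet recurrence step on a d x d state matrix S:
       S <- S - beta * (S k) k^T + beta * v k^T;  output o = S q."""
    d = len(k)
    Sk = [sum(S[i][j] * k[j] for j in range(d)) for i in range(d)]
    S_new = [[S[i][j] - beta * Sk[i] * k[j] + beta * v[i] * k[j]
              for j in range(d)] for i in range(d)]
    o = [sum(S_new[i][j] * q[j] for j in range(d)) for i in range(d)]
    return S_new, o

def hybrid_forecast(series, kernel, beta=0.5):
    """Interleave the two stages: long convolution, then a DeltaNet
       layer scanned over time; return the last output as a toy
       one-step-ahead forecast."""
    h = causal_conv(series, kernel)      # convolution stage
    S = [[0.0]]                          # 1x1 state (scalar projections)
    out = 0.0
    for x in h:
        # Trivial key/value/query projections (an assumption for brevity).
        S, o = deltanet_step(S, k=[1.0], v=[x], q=[1.0], beta=beta)
        out = o[0]
    return out
```

With `kernel=[1.0]` the convolution is the identity and the recurrence reduces to an exponential moving average, which makes the delta-rule update easy to check by hand; a real model would learn the kernel, the projections, and beta per layer.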