[2411.03941] Modular Deep Learning for Multivariate Time-Series: Decoupling Imputation and Downstream Tasks
Summary
This paper proposes a modular approach to deep learning for multivariate time-series data, separating imputation from downstream tasks to enhance model reusability and adaptability.
Why It Matters
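Missing values are pervasive in large-scale time-series data and complicate reliable analysis and decision-making. Most existing neural methods are end-to-end, so the imputation step cannot easily be reused, interpreted, or assessed separately from the downstream task. The paper argues for a modular framework that decouples the two, and its results indicate that this maintains high performance while gaining flexibility and reusability.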
Key Takeaways
- Decoupling imputation from predictive tasks enhances model flexibility.
- A modular approach allows for independent optimization of components.
- The modular pipeline maintains high performance across six state-of-the-art models and seven datasets spanning multiple domains.
- The evaluation uses PyPOTS, an open-source Python library for deep learning-based time-series analysis.
- Modularity addresses the limited reusability and reduced interpretability of end-to-end models.
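The decoupling described in the takeaways can be sketched in a few lines. This is a minimal illustration and not the paper's pipeline or the PyPOTS API: `MeanImputer`, `NearestCentroidClassifier`, and `NearestNeighbourRegressor` are hypothetical stand-ins, and per-feature mean imputation replaces the neural imputers the paper evaluates. The point is only the structure: the imputer is fitted once, and its output feeds two unrelated downstream models.

```python
MISSING = None  # marker for a missing value in this sketch


class MeanImputer:
    """Stand-in for a trained imputation model (hypothetical, not PyPOTS)."""

    def fit(self, samples):
        self.means = []
        for column in zip(*samples):
            observed = [v for v in column if v is not MISSING]
            self.means.append(sum(observed) / len(observed) if observed else 0.0)
        return self

    def impute(self, samples):
        # Returns new lists; the input samples are left untouched.
        return [
            [self.means[j] if v is MISSING else v for j, v in enumerate(row)]
            for row in samples
        ]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroidClassifier:
    """Downstream task 1: classification on imputed data."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: sq_dist(row, self.centroids[c]))
            for row in X
        ]


class NearestNeighbourRegressor:
    """Downstream task 2: regression on the same imputed data."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        return [
            self.y[min(range(len(self.X)), key=lambda i: sq_dist(row, self.X[i]))]
            for row in X
        ]


# Toy data: each sample is a flattened series with missing entries.
train = [[0.0, None, 0.2], [0.1, 0.1, None], [1.0, 0.9, None], [None, 1.1, 1.0]]
labels = [0, 0, 1, 1]
targets = [0.1, 0.1, 1.0, 1.0]
test = [[None, 0.0, 0.1], [0.9, None, 1.1]]

# Stage 1: fit the imputer once, independently of any downstream task.
imputer = MeanImputer().fit(train)
train_filled = imputer.impute(train)
test_filled = imputer.impute(test)

# Stage 2: reuse the same imputed data for two different downstream models.
clf = NearestCentroidClassifier().fit(train_filled, labels)
reg = NearestNeighbourRegressor().fit(train_filled, targets)
class_preds = clf.predict(test_filled)
value_preds = reg.predict(test_filled)
print(class_preds, value_preds)
```

Because the two stages only share the imputed arrays, either side can be swapped without retraining the other, which is the reusability argument in its simplest form.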
arXiv:2411.03941 (cs), Computer Science > Machine Learning. Submitted on 6 Nov 2024 (v1), last revised 25 Feb 2026 (this version, v3).
Authors: Joseph Arul Raj, Linglong Qian, Zina Ibrahim
Abstract
Missing values are pervasive in large-scale time-series data, posing challenges for reliable analysis and decision-making. Many neural architectures have been designed to model and impute the complex and heterogeneous missingness patterns of such data. Most existing methods are end-to-end, rendering imputation tightly coupled with downstream predictive tasks and leading to limited reusability of the trained model, reduced interpretability, and challenges in assessing model quality. In this paper, we call for a modular approach that decouples imputation and downstream tasks, enabling independent optimisation and greater adaptability. Using the largest open-source Python library for deep learning-based time-series analysis, PyPOTS, we evaluate a modular pipeline across six state-of-the-art models that perform imputation and prediction on seven datasets spanning multiple domains. Our results show that a modular approach maintains high performance while prioritising flexibility and reusability ...
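One practical consequence of decoupling is that imputation quality can be assessed on its own, without reference to any downstream predictor. A common way to do this, sketched below, is to hide a fraction of the observed values, impute them, and score the error on the hidden positions only. Everything here is an illustrative assumption: the function names, the 20% masking rate, and the two simple imputers (last observation carried forward and zero fill) are not taken from the paper or from PyPOTS.

```python
import random

MISSING = None  # marker for a missing observation in this sketch


def locf_impute(series):
    """Last observation carried forward per feature; 0.0 before the first one."""
    last = [0.0] * len(series[0])
    filled = []
    for row in series:
        last = [last[j] if v is MISSING else v for j, v in enumerate(row)]
        filled.append(list(last))
    return filled


def zero_impute(series):
    """Baseline: replace every missing value with 0.0."""
    return [[0.0 if v is MISSING else v for v in row] for row in series]


def mask_observed(series, rate, rng):
    """Hide a random fraction of observed values and record their positions."""
    masked = [list(row) for row in series]
    hidden = []
    for t, row in enumerate(series):
        for j, v in enumerate(row):
            if v is not MISSING and rng.random() < rate:
                masked[t][j] = MISSING
                hidden.append((t, j))
    return masked, hidden


def imputation_mae(impute_fn, series, rate=0.2, seed=0):
    """Mean absolute error on the artificially hidden values only."""
    masked, hidden = mask_observed(series, rate, random.Random(seed))
    imputed = impute_fn(masked)
    if not hidden:
        return 0.0
    return sum(abs(imputed[t][j] - series[t][j]) for t, j in hidden) / len(hidden)


# Toy multivariate series (time steps x features) with two genuinely missing entries.
series = [[float(t), 10.0 + t] for t in range(50)]
series[5][0] = MISSING
series[20][1] = MISSING

# The same seed gives both imputers the same hidden positions, so scores are comparable.
for name, fn in [("LOCF", locf_impute), ("zero fill", zero_impute)]:
    print(name, round(imputation_mae(fn, series), 3))
```

Any imputer exposing the same callable interface can be dropped into `imputation_mae`, so candidate imputers can be compared before a downstream model is chosen.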