[2510.03255] SciTS: Scientific Time Series Understanding and Generation with LLMs
Summary
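The paper introduces SciTS, a benchmark for evaluating how well large language models (LLMs) understand and generate scientific time series. It targets two gaps: multimodal LLMs that handle time series only as text or images, and unified time series models whose effectiveness on non-periodic, heterogeneous scientific signals remains unclear.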
Why It Matters
Time series is a fundamental modality in scientific data, yet current multimodal LLMs often overlook the challenges it poses. This research highlights the limitations of existing models and proposes a framework, TimeOmni, that enhances LLM capabilities in understanding and generating time series. Better handling of temporal data could improve scientific reasoning and analysis across domains.
Key Takeaways
- SciTS benchmark spans 12 scientific domains and 43 tasks, with over 50k instances.
- General-purpose LLMs outperform specialized time series models in generalizability.
- Representing time series as text or images can hinder performance: text encodings produce very long sequences, and image encodings lose numerical precision.
- TimeOmni framework enhances LLMs' ability to understand and generate time series data.
- Research fills a significant gap in benchmarks and modeling frameworks for scientific time series.
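The takeaway about text representations can be made concrete with a rough sketch. This is an illustration only, not code from the paper: the `serialize` and `text_cost` helpers, the comma-separated fixed-decimal format, and the synthetic sine wave are all assumptions chosen for the example. It shows how the character count of a text-encoded series grows with the number of decimals kept, while keeping fewer decimals increases the rounding error.

```python
import math

def serialize(values, decimals=2):
    """Render a series as comma-separated text with a fixed number of decimals.

    Hypothetical encoding chosen for illustration; real prompts may differ.
    """
    return ",".join(f"{v:.{decimals}f}" for v in values)

def text_cost(values, decimals=2):
    """Return (character count, max absolute rounding error) of the text form."""
    text = serialize(values, decimals)
    # Parse the text back to floats to measure what precision survived.
    recovered = [float(tok) for tok in text.split(",")]
    max_err = max(abs(a - b) for a, b in zip(values, recovered))
    return len(text), max_err

# Synthetic sine wave standing in for a scientific signal (an assumption).
signal = [math.sin(2 * math.pi * i / 1000) for i in range(10_000)]

for decimals in (2, 4, 8):
    n_chars, max_err = text_cost(signal, decimals)
    print(f"decimals={decimals}: {n_chars} chars, max rounding error {max_err:.2e}")
```

Even this 10,000-point series needs tens of thousands of characters at two decimals, and the benchmark's signals reach $10^7$ points, so the trade-off between sequence length and precision becomes severe at that scale.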
Computer Science > Machine Learning
arXiv:2510.03255 (cs)
[Submitted on 26 Sep 2025 (v1), last revised 25 Feb 2026 (this version, v2)]
Title: SciTS: Scientific Time Series Understanding and Generation with LLMs
Authors: Wen Wu, Ziyang Zhang, Liwei Liu, Xuenan Xu, Jimin Zhuang, Ke Fan, Qitan Lv, Junlin Liu, Chen Zhang, Zheqi Yuan, Siyuan Hou, Tianyi Lin, Kai Chen, Bowen Zhou, Chao Zhang
Abstract: The scientific reasoning ability of large language models (LLMs) has recently attracted significant attention. Time series, as a fundamental modality in scientific data, presents unique challenges that are often overlooked in current multimodal LLMs, which either encode numerical sequences as text or convert them into images. Such approaches may be insufficient for comprehensive scientific time series understanding and generation. Existing unified time series models typically specialise in either forecasting or analysis, and their effectiveness on non-periodic, heterogeneous scientific signals remains unclear. To address these gaps, we introduce SciTS, a benchmark spanning 12 scientific domains and 43 tasks, with over 50k instances, covering both univariate and multivariate signals ranging from $10^0$ to $10^7$ in length and up to 10 MHz in frequency. We benchmark 17 models, including text-only LLMs, multimodal LLMs, and unified time ...