[2603.02702] FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing
Computer Science > Artificial Intelligence
arXiv:2603.02702 (cs)
[Submitted on 3 Mar 2026]

Title: FinTexTS: Financial Text-Paired Time-Series Dataset via Semantic-Based and Multi-Level Pairing

Authors: Jaehoon Lee, Suhwan Park, Tae Yoon Lim, Seunghan Lee, Jun Seo, Dongwan Kang, Hwanil Choi, Minjae Kim, Sungdong Yoo, SoonYoung Lee, Yongjae Lee, Wonbin Ahn

Abstract: The financial domain involves a variety of important time-series problems. Recently, time-series analysis methods that jointly leverage textual and numerical information have gained increasing attention. Accordingly, numerous efforts have been made to construct text-paired time-series datasets in the financial domain. However, financial markets are characterized by complex interdependencies, in which a company's stock price is influenced not only by company-specific events but also by events in other companies and broader macroeconomic factors. Existing approaches that pair text with financial time-series data based on simple keyword matching often fail to capture such complex relationships. To address this limitation, we propose a semantic-based and multi-level pairing framework. Specifically, we extract company-specific context for the target company from SEC filings and apply an embedding-based matching mechanism to retrieve semantically releva...
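As a rough illustration of what embedding-based matching means in general, the sketch below ranks candidate news texts by cosine similarity to a company-context string. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a learned text embedding model, and the names `embed`, `cosine`, `retrieve`, `company_context`, and `news`, along with the example sentences, are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(context, articles, top_k=2):
    """Return the top_k (score, article) pairs most similar to the context."""
    query = embed(context)
    scored = [(cosine(query, embed(article)), article) for article in articles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical company context (in the paper this is derived from SEC filings).
company_context = "semiconductor manufacturer supplying memory chips to smartphone makers"

# Hypothetical candidate texts to pair with the company's time series.
news = [
    "Smartphone makers cut orders for memory chips amid weak demand",
    "Central bank holds interest rates steady",
    "Retail chain opens new stores in Europe",
]

top = retrieve(company_context, news, top_k=1)
print(top[0][1])
```

Unlike keyword matching on a company name, the first article is retrieved here even though it never names the company, because it shares vocabulary with the context description. A learned embedding model would extend this to texts that are related in meaning without sharing exact words.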