[2505.20650] FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information
Summary
The paper introduces FinTagging, a benchmark for evaluating LLMs in extracting and structuring financial information, addressing limitations of existing benchmarks.
Why It Matters
Accurate interpretation of numerical data in financial reports is critical for markets and regulators. Existing benchmarks do not evaluate LLMs under realistic reporting conditions, and FinTagging is designed to close that gap.
Key Takeaways
- FinTagging is presented as the first comprehensive benchmark for structure-aware, full-scope XBRL tagging.
- It decomposes the tagging process into Financial Numeric Identification and Financial Concept Linking.
- LLMs perform well in entity extraction but struggle with fine-grained concept linking.
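- The benchmark addresses the shortcomings of existing benchmarks, which treat tagging as flat, single-step classification over small concept subsets and ignore the hierarchical semantics of the taxonomy.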
- This research highlights the need for improved domain-specific reasoning in LLMs.
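To make the two-subtask decomposition in the takeaways concrete, here is a minimal sketch of an identify-then-link pipeline. It is not the paper's method or data: the three-entry taxonomy, the label strings, the regex, and the word-overlap scoring are all assumptions chosen for illustration, and the function names `identify_numerics` and `link_concept` are hypothetical.

```python
import re
from dataclasses import dataclass

# Toy stand-in for the US GAAP taxonomy (the real one has over 10k concepts).
# The label strings are illustrative assumptions, not official taxonomy labels.
TOY_TAXONOMY = {
    "Revenues": "total revenues",
    "NetIncomeLoss": "net income loss",
    "CashAndCashEquivalentsAtCarryingValue": "cash and cash equivalents",
}

# Matches figures such as "$1,250" or "42.7". A deliberately naive pattern:
# it would also match years and other non-financial numbers.
NUMBER = re.compile(r"\$?\d+(?:,\d{3})*(?:\.\d+)?")


@dataclass
class NumericFact:
    value: str    # the numeric mention as it appears in the text
    context: str  # the sentence it was found in


def identify_numerics(sentence: str) -> list[NumericFact]:
    """Step 1 (identification): collect numeric mentions with their sentence."""
    return [NumericFact(m.group(), sentence) for m in NUMBER.finditer(sentence)]


def link_concept(fact: NumericFact) -> str:
    """Step 2 (linking): choose the concept whose label shares the most words
    with the fact's context. Ties go to the first concept in the taxonomy."""
    words = set(re.findall(r"[a-z]+", fact.context.lower()))
    return max(
        TOY_TAXONOMY,
        key=lambda concept: len(words & set(TOY_TAXONOMY[concept].split())),
    )


if __name__ == "__main__":
    for fact in identify_numerics("Total revenues were $1,250 million for the year."):
        print(fact.value, "->", link_concept(fact))
```

Word overlap works on this toy example only because the three labels are far apart. With thousands of closely related concepts arranged in a hierarchy, the linking step becomes much harder, which is consistent with the takeaway that models struggle with fine-grained concept linking.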
arXiv:2505.20650 (cs), Computer Science > Computation and Language
Submitted on 27 May 2025 (v1), last revised 19 Feb 2026 (this version, v4)
Title: FinTagging: Benchmarking LLMs for Extracting and Structuring Financial Information
Authors: Yan Wang, Lingfei Qian, Xueqing Peng, Yang Ren, Keyi Wang, Yi Han, Dongji Feng, Fengran Mo, Shengyuan Lin, Qinchuan Zhang, Kaiwen He, Chenri Luo, Jianxing Chen, Junwei Wu, Chen Xu, Ziyang Xu, Jimin Huang, Guojun Xiong, Xiao-Yang Liu, Qianqian Xie, Jian-Yun Nie
Abstract: Accurate interpretation of numerical data in financial reports is critical for markets and regulators. Although XBRL (eXtensible Business Reporting Language) provides a standard for tagging financial figures, mapping thousands of facts to over 10k US GAAP concepts remains costly and error prone. Existing benchmarks oversimplify this task as flat, single step classification over small subsets of concepts, ignoring the hierarchical semantics of the taxonomy and the structured nature of financial documents. Consequently, these benchmarks fail to evaluate Large Language Models (LLMs) under realistic reporting conditions. To bridge this gap, we introduce FinTagging, the first comprehensive benchmark for structure aware and full scope XBRL tagging. We decompose the complex tagging process into two subtask...