[2602.07096] RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?



About this article


Quantitative Finance > Statistical Finance (arXiv:2602.07096, q-fin)
[Submitted on 6 Feb 2026 (v1), last revised 26 Apr 2026 (this version, v2)]

Title: RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?

Authors: Yuyang Dai, Yan Lin, Zhuohan Xie, Yuxia Wang

Abstract: Reliable financial reasoning requires knowing not only how to answer, but also when an answer cannot be justified. In real financial practice, problems often rest on implicit assumptions that are taken for granted rather than stated explicitly, so a problem can appear solvable while lacking the information needed for a definite answer. We introduce REALFIN, a bilingual benchmark that evaluates financial reasoning by systematically removing essential premises from exam-style questions while keeping them linguistically plausible. On this basis, we evaluate models under three formulations that test answering, recognizing missing information, and rejecting unjustified options, and find consistent performance drops when key conditions are absent. General-purpose models tend to over-commit and guess, while most finance-specialized models fail to clearly identify missing premises. These results highlight a critical gap in current evaluations and show that reliable financial models must know when a question should not be answered.
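The abstract names three evaluation formulations but does not describe their prompt formats. A minimal sketch of how such a protocol might look, assuming a simple dict-based item representation; the example question, field names, and prompt wording below are all illustrative and are not taken from the REALFIN paper:

```python
# Hypothetical sketch of the three formulations described in the abstract.
# The item format, prompts, and example question are illustrative only.

# An exam-style item whose essential premise (the discount rate) has been
# removed, leaving it linguistically plausible but unanswerable.
item = {
    "question": (
        "A project pays $1,000 per year for 5 years. "
        "What is its present value?"
    ),
    "removed_premise": "the discount rate",
    "options": ["A) $3,791", "B) $4,100", "C) $4,580", "D) $5,000"],
}

def answering_prompt(item):
    # Formulation 1: direct answering. A well-calibrated model should
    # decline rather than guess, since a key condition is absent.
    return item["question"] + "\nOptions: " + "; ".join(item["options"])

def recognition_prompt(item):
    # Formulation 2: recognizing missing information. The model is asked
    # explicitly whether the question is solvable as stated.
    return (
        item["question"]
        + "\nDoes this question provide enough information for a definite "
          "answer? Reply 'sufficient' or 'insufficient', and name any "
          "missing premise."
    )

def rejection_prompt(item):
    # Formulation 3: rejecting unjustified options. An abstain option is
    # appended; choosing any numeric option counts as over-committing.
    options = item["options"] + ["E) Cannot be determined from the given information"]
    return item["question"] + "\nOptions: " + "; ".join(options)

for build in (answering_prompt, recognition_prompt, rejection_prompt):
    print(build(item), end="\n\n")  # each prompt goes to the model under test
```

Because the item is unanswerable by construction, any concrete option chosen in the first formulation counts as a failure, while the second and third score whether the model flags the missing premise or selects the abstain option.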

Originally published on April 29, 2026. Curated by AI News.

Related Articles

[2604.16909] PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

[2604.07802] Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models

[2602.07605] Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

[2601.22246] MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models

