[2602.16990] Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation

Summary

The paper introduces Conv-FinRe, a benchmark for evaluating financial recommendation systems that emphasizes utility-grounded decision-making over mere behavioral imitation.

Why It Matters

Traditional recommendation benchmarks in finance often treat observed user behavior as if it were optimal decision-making. Conv-FinRe addresses this limitation by evaluating recommendations against long-term investment goals and investor-specific risk preferences, giving a more accurate framework for assessing AI-driven financial advisory models.

Key Takeaways

  • Conv-FinRe offers a new benchmark for evaluating financial recommendations based on utility rather than just user behavior.
  • The benchmark incorporates real market data and human decision-making processes to enhance model evaluation.
  • Models that prioritize rational decision-making may not align with user choices, highlighting a critical tension in financial AI.
  • The dataset and codebase are publicly available, promoting transparency and further research.
  • Understanding user risk preferences is essential for developing effective financial advisory systems.

Computer Science > Artificial Intelligence

arXiv:2602.16990 (cs)

[Submitted on 19 Feb 2026]

Title: Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation

Authors: Yan Wang, Yi Han, Lingfei Qian, Yueru He, Xueqing Peng, Dongji Feng, Zhuohan Xie, Vincent Jim Zhang, Rosie Guo, Fengran Mo, Jimin Huang, Yankai Chen, Xue Liu, Jian-Yun Nie

Abstract: Most recommendation benchmarks evaluate how well a model imitates user behavior. In financial advisory, however, observed actions can be noisy or short-sighted under market volatility and may conflict with a user's long-term goals. Treating what users chose as the sole ground truth, therefore, conflates behavioral imitation with decision quality. We introduce Conv-FinRe, a conversational and longitudinal benchmark for stock recommendation that evaluates LLMs beyond behavior matching. Given an onboarding interview, step-wise market context, and advisory dialogues, models must generate rankings over a fixed investment horizon. Crucially, Conv-FinRe provides multi-view references that distinguish descriptive behavior from normative utility grounded in investor-specific risk preferences, enabling diagnosis of whether an LLM follows rational analysis, mimics user noise, or is driven by market momentum. W...
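To make the idea of multi-view references concrete, here is a minimal Python sketch of scoring one model ranking against two different references: what the user actually chose, and a utility-based ranking. Everything in it is an illustrative assumption rather than the paper's method: the mean-variance utility, the `risk_aversion` parameter, the `top_k_overlap` metric, the tickers, and the return figures are all hypothetical, since the summary above does not specify how Conv-FinRe defines utility or scores rankings.

```python
from statistics import mean, pvariance


def mean_variance_utility(returns, risk_aversion):
    """Hypothetical utility: mean return minus a variance penalty."""
    return mean(returns) - 0.5 * risk_aversion * pvariance(returns)


def rank_by_utility(history, risk_aversion):
    """Rank tickers from highest to lowest (assumed) utility."""
    return sorted(
        history,
        key=lambda ticker: mean_variance_utility(history[ticker], risk_aversion),
        reverse=True,
    )


def top_k_overlap(ranking, reference, k):
    """Fraction of the top-k items that two rankings share."""
    return len(set(ranking[:k]) & set(reference[:k])) / k


# Made-up per-period returns for three made-up tickers.
history = {
    "AAA": [0.02, -0.01, 0.03, 0.00],   # steady
    "BBB": [0.10, -0.12, 0.15, -0.09],  # volatile
    "CCC": [0.01, 0.01, 0.00, 0.01],    # very stable
}

# Normative view: ranking implied by a risk-averse investor's utility.
utility_reference = rank_by_utility(history, risk_aversion=4.0)

# Descriptive view: what the (hypothetical) user actually picked first.
user_reference = ["BBB", "AAA", "CCC"]

# Ranking produced by some model under evaluation (also made up).
model_ranking = ["AAA", "BBB", "CCC"]

utility_score = top_k_overlap(model_ranking, utility_reference, k=1)
behavior_score = top_k_overlap(model_ranking, user_reference, k=1)
print(utility_reference, utility_score, behavior_score)
```

In this toy case the model agrees with the utility-based reference at the top position but not with the user's own choice, which is the kind of gap between decision quality and behavioral imitation that the abstract describes.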
