[2509.25106] Towards Personalized Deep Research: Benchmarks and Evaluations
Computer Science > Computation and Language
arXiv:2509.25106 (cs) [Submitted on 29 Sep 2025 (v1), last revised 4 Mar 2026 (this version, v3)]

Title: Towards Personalized Deep Research: Benchmarks and Evaluations
Authors: Yuan Liang, Jiaxian Li, Yuqing Wang, Piaohong Wang, Motong Tian, Pai Liu, Shuofei Qiao, Runnan Fang, He Zhu, Ge Zhang, Minghao Liu, Yuchen Eleanor Jiang, Ningyu Zhang, Wangchunshu Zhou

Abstract: Deep Research Agents (DRAs) can autonomously conduct complex investigations and generate comprehensive reports, demonstrating strong real-world potential. However, existing evaluations rely mostly on closed-ended benchmarks, while open-ended deep research benchmarks remain scarce and typically neglect personalized scenarios. To bridge this gap, we introduce Personalized Deep Research Bench (PDR-Bench), the first benchmark for evaluating personalization in DRAs. It pairs 50 diverse research tasks across 10 domains with 25 authentic user profiles that combine structured persona attributes with dynamic real-world contexts, yielding 250 realistic user-task queries. To assess system performance, we propose the PQR Evaluation Framework, which jointly measures Personalization Alignment, Content Quality, and Factual Reliability. Our experiments on a range of systems highlight current capabilities and limitations in handling...