[2603.09643] MM-tau-p$^2$: Persona-Adaptive Prompting for Robust

[2603.09643] MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

arXiv - AI April 08, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.09643: MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

Computer Science > Emerging Technologies arXiv:2603.09643 (cs) [Submitted on 10 Mar 2026 (v1), last revised 7 Apr 2026 (this version, v4)] Title:MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings Authors:Anupam Purwar, Aditya Choudhary View a PDF of the paper titled MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings, by Anupam Purwar and 1 other authors View PDF HTML (experimental) Abstract:Current evaluation frameworks and benchmarks for LLM powered agents focus on text chat driven agents, these frameworks do not expose the persona of user to the agent, thus operating in a user agnostic environment. Importantly, in customer experience management domain, the agent's behaviour evolves as the agent learns about user personality. With proliferation of real time TTS and multi-modal language models, LLM based agents are gradually going to become multi-modal. Towards this, we propose the MM-tau-p$^2$ benchmark with metrics for evaluating the robustness of multi-modal agents in dual control setting with and without persona adaption of user, while also taking user inputs in the planning process to resolve a user query. In particular, our work shows that even with state of-the-art frontier LLMs like GPT-5, GPT 4.1, there are additional considerations measured using metrics viz. multi-modal robustness, turn overhead while introducing multi-modality into LLM based agents. Ove...

Originally published on April 08, 2026. Curated by AI News.

Llms

[2603.16105] Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization

Abstract page for arXiv paper 2603.16105: Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization

arXiv - AI · 4 min · about 2 hours ago

Llms

[2603.07339] Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

Abstract page for arXiv paper 2603.07339: Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

arXiv - AI · 4 min · about 2 hours ago

Llms

[2602.00185] QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities

Abstract page for arXiv paper 2602.00185: QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities

arXiv - AI · 4 min · about 2 hours ago

Llms

[2601.11652] WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

Abstract page for arXiv paper 2601.11652: WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dy...

arXiv - AI · 4 min · about 2 hours ago

[2603.09643] MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

About this article

Related Articles

[2603.16105] Frequency Matters: Fast Model-Agnostic Data Curation for Pruning and Quantization

[2603.07339] Agora: Teaching the Skill of Consensus-Finding with AI Personas Grounded in Human Voice

[2602.00185] QUASAR: A Universal Autonomous System for Atomistic Simulation and a Benchmark of Its Capabilities

[2601.11652] WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

No comments

Stay updated with AI News