[2602.18462] Assessing the Reliability of Persona-Conditioned LLMs as Synthetic Survey Respondents
Summary
This article evaluates the reliability of persona-conditioned large language models (LLMs) as synthetic survey respondents, finding that persona prompting yields no clear improvement in survey alignment and in many cases distorts results.
Why It Matters
Understanding the reliability of LLMs in survey contexts is crucial for researchers in computational social science. The findings challenge the effectiveness of persona conditioning, highlighting potential biases and inaccuracies that could mislead analyses and decision-making.
Key Takeaways
- Persona prompting does not consistently improve LLM reliability in surveys.
- In many cases, persona conditioning can degrade performance and introduce biases.
- Demographic conditioning can redistribute errors, affecting subgroup fidelity.
- Most survey items show minimal change, but some experience significant distortions.
- The study emphasizes the need for careful evaluation of simulation practices in social science.
Computer Science > Computers and Society
arXiv:2602.18462 (cs) [Submitted on 6 Feb 2026]
Title: Assessing the Reliability of Persona-Conditioned LLMs as Synthetic Survey Respondents
Authors: Erika Elizabeth Taday Morocho, Lorenzo Cima, Tiziano Fagni, Marco Avvenuti, Stefano Cresci
Abstract: Using persona-conditioned LLMs as synthetic survey respondents has become a common practice in computational social science and agent-based simulations. Yet, it remains unclear whether multi-attribute persona prompting improves LLM reliability or instead introduces distortions. Here we contribute to this assessment by leveraging a large dataset of U.S. microdata from the World Values Survey. Concretely, we evaluate two open-weight chat models and a random-guesser baseline across more than 70K respondent-item instances. We find that persona prompting does not yield a clear aggregate improvement in survey alignment and, in many cases, significantly degrades performance. Persona effects are highly heterogeneous as most items exhibit minimal change, while a small subset of questions and underrepresented subgroups experience disproportionate distortions. Our findings highlight a key adverse impact of current persona-based simulation practices: demographic conditioning can redistribute error in ways...
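As a rough illustration of the kind of comparison the abstract describes (not the paper's actual evaluation code), survey alignment can be measured as the fraction of respondent-item instances where a model's answer matches the real response, then compared against the expected accuracy of a uniform random guesser. The item, answer options, and predictions below are entirely hypothetical:

```python
def random_guess_accuracy(num_options: int) -> float:
    """Expected accuracy of a uniform random guesser on an item with num_options choices."""
    return 1.0 / num_options

def alignment(predictions, truths):
    """Fraction of respondent-item instances where the model's answer matches the survey response."""
    assert len(predictions) == len(truths)
    return sum(p == t for p, t in zip(predictions, truths)) / len(predictions)

# Toy data: a hypothetical 4-option survey item answered by 8 respondents.
truths = ["A", "B", "A", "C", "A", "D", "B", "A"]
persona_preds = ["A", "B", "C", "C", "A", "A", "B", "A"]  # hypothetical persona-conditioned outputs

print(alignment(persona_preds, truths))   # model alignment on this item
print(random_guess_accuracy(4))           # random-guesser baseline: 0.25
```

In the paper's setting this comparison is run over more than 70K respondent-item instances and broken down per item and per demographic subgroup, which is how the heterogeneous, subgroup-specific distortions become visible.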