[2510.20091] CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity

Summary

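The paper presents CreativityPrism, a framework for evaluating the creativity of large language models (LLMs) across eight tasks in three domains: divergent thinking, creative writing, and logical reasoning. It targets two limitations of existing evaluation methods: heavy reliance on human judges, and fragmentation across domains and definitions of creativity.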

Why It Matters

As LLMs become integral to generating creative content, a standardized evaluation framework is needed to assess their capabilities. CreativityPrism offers a structured approach that clarifies how LLMs perform across diverse creative domains, which is useful for AI developers and researchers.

Key Takeaways

  • CreativityPrism consolidates evaluation tasks into a holistic framework.
  • The framework emphasizes quality, novelty, and diversity in LLM outputs.
  • Proprietary LLMs outperform open-source models in creative writing and logical reasoning.
  • High performance in one creative dimension does not guarantee success in others.
  • A multi-dimensional evaluation approach is necessary for meaningful assessments.
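The last two takeaways suggest reporting each dimension separately instead of collapsing everything into one creativity score. Below is a minimal sketch of that idea, not the paper's implementation. The function name `dimension_profile`, the record layout, the task names, and the scores are all illustrative assumptions; only the three dimension names come from the paper.

```python
from collections import defaultdict
from statistics import mean

# The three dimensions named in the paper's taxonomy.
DIMENSIONS = ("quality", "novelty", "diversity")


def dimension_profile(records):
    """Average scores per dimension, keeping the dimensions separate.

    records: iterable of (task, dimension, score) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    buckets = defaultdict(list)
    for task, dimension, score in records:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension!r}")
        buckets[dimension].append(score)
    # A dimension with no scores is reported as None rather than 0,
    # so a missing measurement is not mistaken for a poor result.
    return {d: (mean(buckets[d]) if buckets[d] else None) for d in DIMENSIONS}


# Placeholder records: task names and scores are invented, not results.
example = [
    ("task_1", "quality", 0.75),
    ("task_2", "quality", 0.25),
    ("task_1", "novelty", 0.25),
    ("task_2", "diversity", 0.5),
]
profile = dimension_profile(example)
print(profile)
```

Returning a per-dimension dictionary keeps a model that scores well on quality but poorly on novelty visible as such, which a single averaged number would hide.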

Computer Science > Computation and Language

arXiv:2510.20091 (cs)
Submitted on 23 Oct 2025 (v1), last revised 17 Feb 2026 (this version, v2)

Title: CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity

Authors: Zhaoyi Joey Hou, Bowei Alvin Zhang, Yining Lu, Bhiman Kumar Baghel, Anneliese Brei, Ximing Lu, Meng Jiang, Faeze Brahman, Snigdha Chaturvedi, Haw-Shiuan Chang, Daniel Khashabi, Xiang Lorraine Li

Abstract: Creativity is often seen as a hallmark of human intelligence. While large language models (LLMs) are increasingly perceived as generating creative text, there is still no holistic and scalable framework to evaluate their creativity across diverse scenarios. Existing methods of LLM creativity evaluation either heavily rely on humans, limiting speed and scalability, or are fragmented across different domains and different definitions of creativity. To address this gap, we propose CREATIVITYPRISM, an evaluation analysis framework that consolidates eight tasks from three domains, divergent thinking, creative writing, and logical reasoning, into a taxonomy of creativity that emphasizes three dimensions: quality, novelty, and diversity of LLM generations. The framework is designed to be scalable with reliable automatic evaluation judges that have been validat...
