[2508.11915] CORE: Measuring Multi-Agent LLM Interaction Quality under Game-Theoretic Pressures
Summary
The paper introduces CORE (Conversational Robustness Evaluation Score), a metric for quantifying language-use quality in multi-agent LLM interactions under game-theoretic pressures, and shows how competitive, cooperative, and neutral incentives shape linguistic adaptation.
Why It Matters
Understanding the quality of interactions among multi-agent systems is crucial for developing more effective AI communication strategies. CORE provides a quantifiable measure that can enhance the design of LLMs, influencing their application in various fields, from gaming to collaborative AI systems.
Key Takeaways
- CORE quantifies language use quality in multi-agent LLM interactions.
- Cooperative interactions show both greater vocabulary growth and more repetition than competitive ones.
- The metric integrates cluster entropy, lexical repetition, and semantic similarity for comprehensive analysis.
- Findings highlight the impact of social incentives on language adaptation in AI systems.
- CORE serves as a diagnostic tool for assessing linguistic robustness in AI interactions.
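The summary names CORE's three components (cluster entropy, lexical repetition, semantic similarity) but not the paper's exact formula. As a minimal illustrative sketch, the snippet below combines stand-ins for each component: plain Shannon entropy over the token distribution in place of cluster entropy, a unique-token ratio for repetition, and Jaccard overlap between consecutive turns as a crude proxy for embedding-based semantic similarity. The weighting and all function names are assumptions, not the paper's method.

```python
import math
from collections import Counter

def shannon_entropy(tokens):
    """Shannon entropy (bits) of the token distribution; a simplified
    stand-in for the paper's cluster entropy, which clusters semantically."""
    counts = Counter(tokens)
    n = len(tokens)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def repetition_rate(tokens):
    """Fraction of tokens that repeat an earlier token."""
    return 1.0 - len(set(tokens)) / len(tokens)

def jaccard_similarity(turn_a, turn_b):
    """Lexical overlap between two turns; a crude proxy for
    embedding-based semantic similarity."""
    a, b = set(turn_a), set(turn_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def core_sketch(turns):
    """Toy dialog-quality score: lexical diversity (entropy discounted by
    repetition) plus mean cross-turn similarity. Illustrative weighting
    only -- NOT the formula from the paper."""
    tokens = [t for turn in turns for t in turn]
    entropy = shannon_entropy(tokens)
    rep = repetition_rate(tokens)
    pairs = list(zip(turns, turns[1:]))
    sim = sum(jaccard_similarity(a, b) for a, b in pairs) / max(len(pairs), 1)
    return entropy * (1.0 - rep) + sim

dialog = [
    "let us split the reward evenly".split(),
    "an even split of the reward works for me".split(),
]
print(round(core_sketch(dialog), 3))
```

A real implementation would replace the Jaccard proxy with embedding cosine similarity and derive the entropy term from semantic clusters rather than surface tokens.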
Paper Details
Computer Science > Computation and Language; arXiv:2508.11915 (cs). Submitted on 16 Aug 2025 (v1); last revised 22 Feb 2026 (this version, v2). Authors: Punya Syon Pandey, Yongjin Yang, Jiarui Liu, Zhijing Jin.
Abstract
Game-theoretic interactions between agents with Large Language Models (LLMs) have revealed many emergent capabilities, yet the linguistic diversity of these interactions has not been sufficiently quantified. In this paper, we present the Conversational Robustness Evaluation Score: CORE, a metric to quantify the effectiveness of language use within multi-agent systems across different game-theoretic interactions. CORE integrates measures of cluster entropy, lexical repetition, and semantic similarity, providing a direct lens on dialog quality. We apply CORE to pairwise LLM dialogs across competitive, cooperative, and neutral settings, further grounding our analysis in Zipf's and Heaps' Laws to characterize word frequency distributions and vocabulary growth. Our findings show that cooperative settings exhibit both steeper Zipf distributions and higher Heaps exponents, indicating more repetition alongside greater vocabulary expansion. In contrast, competitive interactions display lower Zipf and Heaps exponents, ref…
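The abstract grounds its analysis in Zipf's Law (frequency decays as a power of rank) and Heaps' Law (vocabulary grows as a power of corpus size). A hedged sketch of estimating both exponents from a token stream via log-log least-squares regression follows; the paper may well use a different fitting procedure, and these helper names are illustrative.

```python
import math
from collections import Counter

def log_log_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def zipf_exponent(tokens):
    """Zipf exponent s, where frequency ~ rank^(-s): the negated
    log-log slope of the rank-frequency curve."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    ranks = list(range(1, len(freqs) + 1))
    return -log_log_slope(ranks, freqs)

def heaps_exponent(tokens):
    """Heaps exponent beta, where vocab_size ~ n_tokens^beta: the
    log-log slope of vocabulary size against tokens seen."""
    seen, sizes = set(), []
    for t in tokens:
        seen.add(t)
        sizes.append(len(seen))
    ns = list(range(1, len(tokens) + 1))
    return log_log_slope(ns, sizes)

# Synthetic corpus with a rough power-law frequency profile.
corpus = ["the"] * 8 + ["a"] * 4 + ["split"] * 2 + ["reward"]
print(zipf_exponent(corpus), heaps_exponent(corpus))
```

On this reading, the paper's finding that cooperative dialogs have steeper Zipf slopes and higher Heaps exponents means both functions would return larger values on cooperative transcripts than on competitive ones.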