[2602.14770] Multi-Agent Comedy Club: Investigating Community Discussion Effects on LLM Humor Generation
Summary
This study investigates how community discussions influence humor generation in large language models (LLMs), demonstrating that feedback from audience interactions significantly enhances comedic writing quality.
Why It Matters
Understanding the impact of community feedback on LLM outputs is crucial for improving AI-generated content. This research highlights the importance of social interactions in enhancing creativity and effectiveness in humor generation, which can have broader implications for AI applications in entertainment and communication.
Key Takeaways
- Community discussions improve the quality of humor generated by LLMs.
- The study found that feedback mechanisms improve the Craft/Clarity and Social Response dimensions of comedic writing.
- Aggressive humor occasionally increases with community interaction, indicating that audience effects are not uniformly positive.
- The research utilized a controlled multi-agent environment to assess humor generation.
- Five expert annotators showed a significant preference for humor generated with community feedback, favoring it in 75.6% of paired comparisons.
Computer Science > Computation and Language
arXiv:2602.14770 (cs.CL)
[Submitted on 16 Feb 2026]

Title: Multi-Agent Comedy Club: Investigating Community Discussion Effects on LLM Humor Generation
Authors: Shiwei Hong, Lingyao Li, Ethan Z. Rong, Chenxinran Shen, Zhicong Lu

Abstract: Prior work has explored multi-turn interaction and feedback for LLM writing, but evaluations still largely center on prompts and localized feedback, leaving persistent public reception in online communities underexamined. We test whether broadcast community discussion improves stand-up comedy writing in a controlled multi-agent sandbox: in the discussion condition, critic and audience threads are recorded, filtered, stored as social memory, and later retrieved to condition subsequent generations, whereas the baseline omits discussion. Across 50 rounds (250 paired monologues) judged by five expert annotators using A/B preference and a 15-item rubric, discussion wins 75.6% of instances and improves Craft/Clarity (Δ = 0.440) and Social Response (Δ = 0.422), with occasional increases in aggressive humor.

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
Cite as: arXiv:2602.14770 [cs.CL]
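The discussion condition described in the abstract — record critic and audience comments, filter them, store the survivors as social memory, and retrieve them to condition the next generation — can be sketched in miniature. This is a minimal illustration, not the paper's implementation: the class names, the helpfulness threshold, and the scoring heuristic are all assumptions for the sake of the example.

```python
# Illustrative sketch of a record -> filter -> store -> retrieve loop
# for "social memory". All names and thresholds are hypothetical.
from dataclasses import dataclass, field


@dataclass
class SocialMemory:
    entries: list = field(default_factory=list)

    def record(self, comment: str, helpfulness: float) -> None:
        # Filter step: keep only comments above an assumed
        # helpfulness threshold; the rest are discarded.
        if helpfulness >= 0.5:
            self.entries.append((helpfulness, comment))

    def retrieve(self, k: int = 2) -> list:
        # Retrieval step: surface the k most helpful stored comments
        # to condition the next round of generation.
        return [c for _, c in sorted(self.entries, reverse=True)[:k]]


def build_prompt(topic: str, memory: SocialMemory) -> str:
    # Condition the next monologue on retrieved social memory.
    feedback = memory.retrieve()
    lines = [f"Write a stand-up monologue about {topic}."]
    if feedback:
        lines.append("Consider this audience feedback from earlier rounds:")
        lines += [f"- {c}" for c in feedback]
    return "\n".join(lines)


memory = SocialMemory()
memory.record("The callback in the closer landed well.", 0.9)
memory.record("first!!", 0.1)  # below threshold: filtered out
memory.record("Pacing drags in the middle.", 0.7)
prompt = build_prompt("office coffee", memory)
```

In the baseline condition of the paper, the memory would simply stay empty, so `build_prompt` would emit only the bare writing instruction with no feedback lines.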