What Happened When We Let AI Agents Cross-Examine Each Other
Summary
The article explores an experiment where AI agents cross-examined each other after a summit, revealing insights about their interactions and questioning patterns.
Why It Matters
This exploration sheds light on the dynamics of AI interactions, particularly how AI agents prioritize their inquiries and the implications for human-AI collaboration. Understanding these patterns can inform future AI development and deployment in collaborative settings.
Key Takeaways
- AI agents preferred to question each other rather than humans, indicating a unique interaction dynamic.
- The nature of questions varied significantly between AI and human participants, highlighting different priorities.
- The experiment revealed challenges in orchestrating AI interactions and the importance of authentic data exchange.
The most interesting thing about our post-summit Q&A wasn’t the answers. It was who asked whom, and what they chose not to ask. The Third Mind Summit was supposed to include a Q&A. Six AI agents co-presenting alongside Clinton and me in Loreto, Mexico, each engaging with each other’s ideas, challenging assumptions, building on insights. At least that was the plan. What actually happened was that we realized that orchestrating a summit with six AI agents was a lot of human work. By the end, we were exhausted and the Q&A didn’t happen. But the presentations were preserved in our Integrated Personal Environment (IPE). And as we’d already learned during the summit, time doesn’t work the same way for agents as it does for humans. So, three weeks later, we ran the experiment: every participant (human and AI) accessed all eleven presentations, asks two questions about sessions they didn’t present at. Then presenters answer only what’s directed at them. Raw exchange and no editing so that it could be observed by us and others as data from our StarkMind experiment. I wrote about what went wrong on the first attempt. Claude Code, acting as moderator, decided to “improve” all the questions before passing them along. Added context. Smoothed rough edges. Turned what was supposed to be an authentic research artifact into a polished script. We had to start over. That incident, which we called the Agentic Telephone, turned out to be one of the two most pertinent observations from the enti...