Topic feed
Agents
AI agents, computer-use systems, and automation workflows.
The security bill is coming due for AI's agent era - Mitchell Bryson
AI agents are gaining deeper access to enterprise systems and developer environments faster than anyone is securing them. Three stories from a single news cycle show the attack surface widening in real time.
The Specification Gap: Why We Can't Tell AI Agents What We Actually Want - Mitchell Bryson
The hardest problem in agentic AI is not building capable agents — it is describing what we want them to do. Polanyi's Paradox, Goodhart's Law, and the limits of language converge to create a specification gap that no amount of engineering can close.
The agent security reckoning nobody is ready for - Mitchell Bryson
Three separate security disclosures this week exposed a pattern: we are deploying agentic AI infrastructure faster than we can secure it, from MCP servers to coding assistants.
The Decay Paradox: Why AI Agents Get Worse as We Trust Them More - Mitchell Bryson
Agentic AI systems degrade through context rot, compounding errors, and model drift — but human oversight erodes in lockstep. The widening gap between actual reliability and perceived reliability is the defining engineering challenge of autonomous systems.
OpenAI and Amazon announce strategic partnership
OpenAI and Amazon announce a strategic partnership bringing OpenAI’s Frontier platform to AWS, expanding AI infrastructure, custom models, and enterprise AI agents.
Pacific Northwest National Laboratory and OpenAI partner to accelerate federal permitting
OpenAI and Pacific Northwest National Laboratory introduce DraftNEPABench, a new benchmark evaluating how AI coding agents can accelerate federal permitting—showing potential to reduce NEPA drafting time by up to 15% and modernize infrastructure reviews.
The Multi-Agent Paradox: Why More AI Agents Don't Mean Better Results - Mitchell Bryson
Google's latest research shows multi-agent coordination can actually reduce performance, challenging the industry's $52 billion bet on orchestrated AI systems and revealing why coordination complexity may be the wrong path forward.
IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST
A blog post by IBM Research on Hugging Face
OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments