Agents page 24

OpenAI News March 05, 2026 10:00

Introducing GPT-5.4

Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.

OpenAI Models Agents

Mitchell Bryson AI Articles March 04, 2026 00:00

The security bill is coming due for AI's agent era

AI agents are gaining deeper access to enterprise systems and developer environments faster than anyone is securing them. Three stories from a single news cycle show the attack surface widening in real time.

Agents

Mitchell Bryson AI Articles March 04, 2026 00:00

The Specification Gap: Why We Can't Tell AI Agents What We Actually Want

The hardest problem in agentic AI is not building capable agents — it is describing what we want them to do. Polanyi's Paradox, Goodhart's Law, and the limits of language converge to create a specification gap that no amount of engineering can close.

Agents

Mitchell Bryson AI Articles March 02, 2026 00:00

The agent security reckoning nobody is ready for

Three separate security disclosures this week exposed a pattern: we are deploying agentic AI infrastructure faster than we can secure it, from MCP servers to coding assistants.

Agents Infrastructure

Mitchell Bryson AI Articles March 01, 2026 00:00

The Decay Paradox: Why AI Agents Get Worse as We Trust Them More

Agentic AI systems degrade through context rot, compounding errors, and model drift — but human oversight erodes in lockstep. The widening gap between actual reliability and perceived reliability is the defining engineering challenge of autonomous systems.

Agents

OpenAI News February 27, 2026 05:30

OpenAI and Amazon announce strategic partnership

OpenAI and Amazon announce a strategic partnership bringing OpenAI’s Frontier platform to AWS, expanding AI infrastructure, custom models, and enterprise AI agents.

OpenAI Models Agents Infrastructure

OpenAI News February 27, 2026 05:30

Introducing the Stateful Runtime Environment for Agents in Amazon Bedrock

Stateful Runtime for Agents in Amazon Bedrock brings persistent orchestration, memory, and secure execution to multi-step AI workflows powered by OpenAI.

OpenAI Models Agents

OpenAI News February 26, 2026 10:00

Pacific Northwest National Laboratory and OpenAI partner to accelerate federal permitting

OpenAI and Pacific Northwest National Laboratory introduce DraftNEPABench, a new benchmark evaluating how AI coding agents can accelerate federal permitting—showing potential to reduce NEPA drafting time by up to 15% and modernize infrastructure reviews.

OpenAI Models Agents Infrastructure

Mitchell Bryson AI Articles February 25, 2026 00:00

The Multi-Agent Paradox: Why More AI Agents Don't Mean Better Results

Google's latest research shows multi-agent coordination can actually reduce performance, challenging the industry's $52 billion bet on orchestrated AI systems and revealing why coordination complexity may be the wrong path forward.

Agents Google

Hugging Face Blog February 18, 2026 16:15

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

A blog post by IBM Research on Hugging Face

Agents

OpenAI News February 18, 2026 00:00

Introducing EVMbench

OpenAI and Paradigm introduce EVMbench, a benchmark evaluating AI agents’ ability to detect, patch, and exploit high-severity smart contract vulnerabilities.

OpenAI Models Agents

Hugging Face Blog February 12, 2026 00:00

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

Agents