AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Robotics

What happens when AI agents can earn and spend real money? I built a small test to find out

I've been sitting with a question for a while: what happens when AI agents aren't just tools to be used, but participants in an economy? ...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

LLMs

[2601.00809] A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

arXiv - AI · 4 min · about 10 hours ago

All Content

LLMs

[2602.22680] Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions

This survey paper explores the development of personalized LLM-powered agents, focusing on their foundations, evaluation metrics, and fut...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22556] Stable Adaptive Thinking via Advantage Shaping and Length-Aware Gradient Regulation

The paper presents a two-stage framework for enhancing large reasoning models (LRMs) by addressing overthinking in low-complexity queries...

arXiv - AI · 3 min · about 1 month ago

AI Agents

[2602.22650] AHBid: An Adaptable Hierarchical Bidding Framework for Cross-Channel Advertising

The paper presents AHBid, a novel hierarchical bidding framework for cross-channel advertising that enhances budget allocation and adapta...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22555] Autoregressive Visual Decoding from EEG Signals

The paper presents AVDE, a novel framework for decoding visual information from EEG signals, addressing challenges in modality bridging a...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22638] MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

MobilityBench introduces a benchmark for evaluating LLM-based route-planning agents, addressing challenges in real-world mobility scenari...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22603] SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning

The paper presents SideQuest, a novel model-driven approach for managing KV cache in long-horizon reasoning tasks, achieving significant ...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22583] Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance

This paper explores the concept of strategy executability in mathematical reasoning, highlighting the differences between human and model...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22557] CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety

CourtGuard introduces a model-agnostic framework for zero-shot policy adaptation in LLM safety, enhancing adaptability and performance wi...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.22546] Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention

This article presents a framework called AHCE for enhancing Large Language Model (LLM) agents through effective human collaboration, sign...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.22539] Agentic AI for Intent-driven Optimization in Cell-free O-RAN

This paper presents an agentic AI framework for optimizing intent-driven operations in cell-free O-RAN, enhancing collaboration among age...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22523] Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents

The paper discusses how cognitive models and AI algorithms can serve as templates for designing modular language agents, addressing limit...

arXiv - AI · 3 min · about 1 month ago

AI Agents

[2602.22519] A Mathematical Theory of Agency and Intelligence

This paper presents a mathematical framework for understanding agency and intelligence in AI systems, introducing the concept of bipredic...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22508] Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models

The paper presents Metacognitive Behavioral Tuning (MBT), a framework designed to enhance large reasoning models by incorporating human-l...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.22480] VeRO: An Evaluation Harness for Agents to Optimize Agents

The paper introduces VeRO, an evaluation harness designed for optimizing coding agents through structured evaluation and benchmarking, ad...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.22465] ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization

The paper introduces ConstraintBench, a benchmark designed to evaluate large language models (LLMs) on direct constrained optimization ta...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22452] CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines

The paper presents Contrastive World Models (CWM) for enhancing action feasibility learning in embodied agents, improving action scoring ...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22442] A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines

This article presents a framework for evaluating AI agent decisions in AutoML pipelines, emphasizing decision-centric metrics over tradit...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22441] How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision?

This paper analyzes latent reasoning methods under varying supervision levels, revealing key issues like shortcut behavior and the trade-...

arXiv - Machine Learning · 4 min · about 1 month ago

Generative AI

[2602.22425] ArchAgent: Agentic AI-driven Computer Architecture Discovery

ArchAgent is an AI-driven system that automates computer architecture discovery, achieving significant performance improvements in cache ...

arXiv - AI · 4 min · about 1 month ago

AI Safety

[2602.22413] Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

This paper explores a probabilistic framework for collective decision-making among agents that can assess their own reliability and selec...

arXiv - AI · 3 min · about 1 month ago

