AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min
Robotics

What happens when AI agents can earn and spend real money? I built a small test to find out

I've been sitting with a question for a while: what happens when AI agents aren't just tools to be used, but participants in an economy? ...

Reddit - Artificial Intelligence · 1 min
LLMs

[2601.00809] A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

Abstract page for arXiv paper 2601.00809: A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

arXiv - AI · 4 min

All Content

LLMs

[2602.22680] Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions

This survey paper explores the development of personalized LLM-powered agents, focusing on their foundations, evaluation metrics, and fut...

arXiv - AI · 4 min
Machine Learning

[2602.22556] Stable Adaptive Thinking via Advantage Shaping and Length-Aware Gradient Regulation

The paper presents a two-stage framework for enhancing large reasoning models (LRMs) by addressing overthinking in low-complexity queries...

arXiv - AI · 3 min
AI Agents

[2602.22650] AHBid: An Adaptable Hierarchical Bidding Framework for Cross-Channel Advertising

The paper presents AHBid, a novel hierarchical bidding framework for cross-channel advertising that enhances budget allocation and adapta...

arXiv - AI · 4 min
Machine Learning

[2602.22555] Autoregressive Visual Decoding from EEG Signals

The paper presents AVDE, a novel framework for decoding visual information from EEG signals, addressing challenges in modality bridging a...

arXiv - AI · 4 min
LLMs

[2602.22638] MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

MobilityBench introduces a benchmark for evaluating LLM-based route-planning agents, addressing challenges in real-world mobility scenari...

arXiv - AI · 4 min
LLMs

[2602.22603] SideQuest: Model-Driven KV Cache Management for Long-Horizon Agentic Reasoning

The paper presents SideQuest, a novel model-driven approach for managing KV cache in long-horizon reasoning tasks, achieving significant ...

arXiv - Machine Learning · 3 min
Machine Learning

[2602.22583] Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance

This paper explores the concept of strategy executability in mathematical reasoning, highlighting the differences between human and model...

arXiv - AI · 4 min
LLMs

[2602.22557] CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety

CourtGuard introduces a model-agnostic framework for zero-shot policy adaptation in LLM safety, enhancing adaptability and performance wi...

arXiv - Machine Learning · 3 min
LLMs

[2602.22546] Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention

This article presents a framework called AHCE for enhancing Large Language Model (LLM) agents through effective human collaboration, sign...

arXiv - AI · 3 min
LLMs

[2602.22539] Agentic AI for Intent-driven Optimization in Cell-free O-RAN

This paper presents an agentic AI framework for optimizing intent-driven operations in cell-free O-RAN, enhancing collaboration among age...

arXiv - AI · 4 min
LLMs

[2602.22523] Cognitive Models and AI Algorithms Provide Templates for Designing Language Agents

The paper discusses how cognitive models and AI algorithms can serve as templates for designing modular language agents, addressing limit...

arXiv - AI · 3 min
AI Agents

[2602.22519] A Mathematical Theory of Agency and Intelligence

This paper presents a mathematical framework for understanding agency and intelligence in AI systems, introducing the concept of bipredic...

arXiv - AI · 4 min
LLMs

[2602.22508] Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models

The paper presents Metacognitive Behavioral Tuning (MBT), a framework designed to enhance large reasoning models by incorporating human-l...

arXiv - AI · 3 min
LLMs

[2602.22480] VeRO: An Evaluation Harness for Agents to Optimize Agents

The paper introduces VeRO, an evaluation harness designed for optimizing coding agents through structured evaluation and benchmarking, ad...

arXiv - Machine Learning · 3 min
LLMs

[2602.22465] ConstraintBench: Benchmarking LLM Constraint Reasoning on Direct Optimization

The paper introduces ConstraintBench, a benchmark designed to evaluate large language models (LLMs) on direct constrained optimization ta...

arXiv - AI · 4 min
Machine Learning

[2602.22452] CWM: Contrastive World Models for Action Feasibility Learning in Embodied Agent Pipelines

The paper presents Contrastive World Models (CWM) for enhancing action feasibility learning in embodied agents, improving action scoring ...

arXiv - AI · 4 min
LLMs

[2602.22442] A Framework for Assessing AI Agent Decisions and Outcomes in AutoML Pipelines

This article presents a framework for evaluating AI agent decisions in AutoML pipelines, emphasizing decision-centric metrics over tradit...

arXiv - AI · 4 min
Machine Learning

[2602.22441] How Do Latent Reasoning Methods Perform Under Weak and Strong Supervision?

This paper analyzes latent reasoning methods under varying supervision levels, revealing key issues like shortcut behavior and the trade-...

arXiv - Machine Learning · 4 min
Generative AI

[2602.22425] ArchAgent: Agentic AI-driven Computer Architecture Discovery

ArchAgent is an AI-driven system that automates computer architecture discovery, achieving significant performance improvements in cache ...

arXiv - AI · 4 min
AI Safety

[2602.22413] Epistemic Filtering and Collective Hallucination: A Jury Theorem for Confidence-Calibrated Agents

This paper explores a probabilistic framework for collective decision-making among agents that can assess their own reliability and selec...

arXiv - AI · 3 min