AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min ·
Robotics

What happens when AI agents can earn and spend real money? I built a small test to find out

I've been sitting with a question for a while: what happens when AI agents aren't just tools to be used, but participants in an economy? ...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[2601.00809] A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

Abstract page for arXiv paper 2601.00809: A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

arXiv - AI · 4 min ·

All Content

LLMs

[2602.23092] Enhancing CVRP Solver through LLM-driven Automatic Heuristic Design

This paper introduces AILS-AHD, a novel approach that utilizes Large Language Models to enhance the Capacitated Vehicle Routing Problem (...

arXiv - AI · 3 min ·
AI Agents

[2602.23056] Learning-based Multi-agent Race Strategies in Formula 1

This paper presents a reinforcement learning approach to optimize multi-agent race strategies in Formula 1, focusing on energy management...

arXiv - AI · 3 min ·
LLMs

[2602.22719] Interpreting and Steering State-Space Models via Activation Subspace Bottlenecks

This paper explores the interpretability and steerability of state-space models (SSMs) by identifying activation subspace bottlenecks and...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22973] Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots

The paper presents a framework for improving AI diagnostic alignment in clinical settings by preserving AI-generated reports as immutable...

arXiv - AI · 4 min ·
LLMs

[2602.22703] Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning

The paper presents GeoPerceive, a benchmark for evaluating geometric perception in vision-language models (VLMs), and introduces GeoDPO, ...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22971] SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy

The paper presents SPM-Bench, a benchmark for evaluating large language models in scanning probe microscopy, addressing gaps in existing ...

arXiv - AI · 4 min ·
LLMs

[2602.22963] FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

FactGuard introduces an innovative framework for detecting video misinformation using reinforcement learning, enhancing the capabilities ...

arXiv - AI · 3 min ·
LLMs

[2602.22953] General Agent Evaluation

This paper introduces a framework for evaluating general-purpose agents, proposing a Unified Protocol and Exgentic framework, and benchma...

arXiv - AI · 3 min ·
Machine Learning

[2602.22673] Forecasting Antimicrobial Resistance Trends Using Machine Learning on WHO GLASS Surveillance Data: A Retrieval-Augmented Generation Approach for Policy Decision Support

This article presents a machine learning framework for forecasting antimicrobial resistance (AMR) trends using WHO GLASS data, highlighti...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.22897] OmniGAIA: Towards Native Omni-Modal AI Agents

The paper introduces OmniGAIA, a benchmark for evaluating omni-modal AI agents that integrate vision, audio, and language for complex rea...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.22879] Towards LLM-Empowered Knowledge Tracing via LLM-Student Hierarchical Behavior Alignment in Hyperbolic Space

This article presents a novel approach to knowledge tracing using a Large Language Model (LLM) to enhance the understanding of student le...

arXiv - AI · 4 min ·
AI Agents

[2602.22842] The AI Research Assistant: Promise, Peril, and a Proof of Concept

This article explores the role of AI in mathematical research, highlighting both its capabilities and limitations through a case study on...

arXiv - AI · 3 min ·
AI Agents

[2602.22839] DeepPresenter: Environment-Grounded Reflection for Agentic Presentation Generation

DeepPresenter introduces an innovative framework for generating presentations that adapts to user needs and incorporates environmental fe...

arXiv - AI · 3 min ·
LLMs

[2602.22623] ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL

The paper presents ContextRL, a framework that enhances knowledge discovery efficiency in multimodal large language models (MLLMs) through c...

arXiv - AI · 4 min ·
Machine Learning

[2602.22814] When Should an AI Act? A Human-Centered Model of Scene, Context, and Behavior for Agentic AI Design

This article presents a human-centered model for agentic AI design, focusing on when AI should act based on contextual understanding and ...

arXiv - AI · 3 min ·
LLMs

[2602.22808] MiroFlow: Towards High-Performance and Robust Open-Source Agent Framework for General Deep Research Tasks

MiroFlow is an innovative open-source agent framework designed to enhance the performance and robustness of large language models in comp...

arXiv - AI · 3 min ·
LLMs

[2602.22769] AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications

The paper introduces AMA-Bench, a new benchmark for evaluating long-horizon memory in Large Language Models (LLMs) for agentic applicatio...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22751] Know What You Know: Metacognitive Entropy Calibration for Verifiable RL Reasoning

The paper proposes EGPO, a metacognitive entropy calibration framework that integrates intrinsic uncertainty into reinforcement learning ...

arXiv - AI · 4 min ·
LLMs

[2602.22581] IBCircuit: Towards Holistic Circuit Discovery with Information Bottleneck

The paper presents IBCircuit, a novel framework for holistic circuit discovery in machine learning models using the Information Bottlenec...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22702] Knob: A Physics-Inspired Gating Interface for Interpretable and Controllable Neural Dynamics

The paper introduces 'Knob', a physics-inspired framework that enhances neural network calibration by allowing dynamic adjustments to mod...

arXiv - AI · 4 min ·