AI Agents
Autonomous agents, tool use, and agentic systems
Top This Week
Considering NeurIPS submission [D]
Wondering if it is worth submitting the paper I’m working on to NeurIPS. I have a formal mathematical proof for convergence of a novel agentic sys...
Anthropic cuts off the ability to use Claude subscriptions with OpenClaw and third-party AI agents
All Content
[2511.17439] InTAct: Interval-based Task Activation Consolidation for Continual Learning
The paper presents InTAct, a novel method for continual learning that mitigates catastrophic forgetting by using interval-based task acti...
[2511.12779] Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation
This paper presents a novel approach to multi-objective reinforcement learning by introducing a two-stage procedure that efficiently esti...
[2511.08094] Stuart-Landau Oscillatory Graph Neural Network
The paper introduces the Stuart-Landau Oscillatory Graph Neural Network (SLGNN), a novel architecture that addresses oversmoothing and va...
[2511.07730] Multistep Quasimetric Learning for Scalable Goal-conditioned Reinforcement Learning
This paper presents a novel approach to goal-conditioned reinforcement learning (GCRL) using multistep quasimetric learning, demonstratin...
[2511.05640] Blind Inverse Game Theory: Jointly Decoding Rewards and Rationality in Entropy-Regularized Competitive Games
This paper presents Blind Inverse Game Theory, a novel framework for jointly decoding rewards and rationality in entropy-regularized comp...
[2211.12817] Learning to See the Elephant in the Room: Self-Supervised Context Reasoning in Humans and AI
This article explores self-supervised context reasoning in humans and AI, presenting a model called SeCo that learns contextual relations...
[2511.04847] Test-Time Adaptation for LLM Agents via Environment Interaction
This paper presents strategies for adapting large language model (LLM) agents to new environments during deployment, addressing challenge...
[2204.07520] Resource-Aware Distributed Submodular Maximization: A Paradigm for Multi-Robot Decision-Making
This paper presents a novel algorithm for resource-aware distributed submodular maximization, enhancing multi-robot decision-making by ba...
[2510.15940] Lean Finder: Semantic Search for Mathlib That Understands User Intents
Lean Finder is a semantic search engine designed for the Lean programming language and mathlib, improving theorem retrieval by understand...
[2510.22512] Transitive RL: Value Learning via Divide and Conquer
The paper introduces Transitive Reinforcement Learning (TRL), a novel value learning algorithm that enhances offline goal-conditioned rei...
[2510.03817] TROLL: Trust Regions improve Reinforcement Learning for Large Language Models
The paper presents TROLL, a novel method that replaces traditional PPO-like clipping in reinforcement learning with a trust region optimi...
[2602.12268] CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use
The paper presents CM2, a novel reinforcement learning framework that utilizes checklist rewards to enhance multi-turn and multi-step too...
[2510.03346] KVComm: Enabling Efficient LLM Communication through Selective KV Sharing
The paper introduces KVComm, a novel framework for efficient communication between Large Language Models (LLMs) using selective KV pair s...
[2602.11767] TSR: Trajectory-Search Rollouts for Multi-Turn RL of LLM Agents
The paper introduces TSR (Trajectory-Search Rollouts), a novel approach to enhance multi-turn reinforcement learning for large language m...
[2602.11298] Voxtral Realtime
Voxtral Realtime presents a novel streaming automatic speech recognition model achieving offline transcription quality with sub-second la...
[2602.08354] Does Your Reasoning Model Implicitly Know When to Stop Thinking?
This article explores how large reasoning models (LRMs) can implicitly determine when to stop processing information, introducing a new s...
[2510.00553] On Predictability of Reinforcement Learning Dynamics for Large Language Models
This article explores the predictability of reinforcement learning dynamics in large language models (LLMs), highlighting key properties ...
[2602.08104] Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems
This paper presents a novel framework for interpretable failure analysis in Multi-Agent Reinforcement Learning (MARL) systems, focusing o...
[2602.07883] ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Intrinsic Adaptation
The article introduces ToolSelf, a novel framework for enhancing agentic systems using Large Language Models (LLMs) by enabling runtime s...
[2509.25424] Polychromic Objectives for Reinforcement Learning
The paper introduces polychromic objectives for reinforcement learning, enhancing policy diversity and exploration in pretrained models, ...