AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

"They operate like slot machines": AI agents are scrambling power users' brains

AI Tools & Products ·
AI Agents

Considering NeurIPS submission [D]

Wondering if it is worth submitting the paper I’m working on to NeurIPS. I have a formal mathematical proof for convergence of a novel agentic sys...

Reddit - Machine Learning · 1 min ·
LLMs

Anthropic cuts off the ability to use Claude subscriptions with OpenClaw and third-party AI agents

AI Tools & Products ·

All Content

[2511.17439] InTAct: Interval-based Task Activation Consolidation for Continual Learning
AI Infrastructure

The paper presents InTAct, a novel method for continual learning that mitigates catastrophic forgetting by using interval-based task acti...

arXiv - AI · 4 min ·
[2511.12779] Scalable Multi-Objective and Meta Reinforcement Learning via Gradient Estimation
LLMs

This paper presents a novel approach to multi-objective reinforcement learning by introducing a two-stage procedure that efficiently esti...

arXiv - AI · 4 min ·
[2511.08094] Stuart-Landau Oscillatory Graph Neural Network
Machine Learning

The paper introduces the Stuart-Landau Oscillatory Graph Neural Network (SLGNN), a novel architecture that addresses oversmoothing and va...

arXiv - Machine Learning · 4 min ·
[2511.07730] Multistep Quasimetric Learning for Scalable Goal-conditioned Reinforcement Learning
NLP

This paper presents a novel approach to goal-conditioned reinforcement learning (GCRL) using multistep quasimetric learning, demonstratin...

arXiv - Machine Learning · 4 min ·
[2511.05640] Blind Inverse Game Theory: Jointly Decoding Rewards and Rationality in Entropy-Regularized Competitive Games
Machine Learning

This paper presents Blind Inverse Game Theory, a novel framework for jointly decoding rewards and rationality in entropy-regularized comp...

arXiv - Machine Learning · 4 min ·
[2211.12817] Learning to See the Elephant in the Room: Self-Supervised Context Reasoning in Humans and AI
Machine Learning

This article explores self-supervised context reasoning in humans and AI, presenting a model called SeCo that learns contextual relations...

arXiv - AI · 4 min ·
[2511.04847] Test-Time Adaptation for LLM Agents via Environment Interaction
LLMs

This paper presents strategies for adapting large language model (LLM) agents to new environments during deployment, addressing challenge...

arXiv - Machine Learning · 4 min ·
[2204.07520] Resource-Aware Distributed Submodular Maximization: A Paradigm for Multi-Robot Decision-Making
Robotics

This paper presents a novel algorithm for resource-aware distributed submodular maximization, enhancing multi-robot decision-making by ba...

arXiv - AI · 4 min ·
[2510.15940] Lean Finder: Semantic Search for Mathlib That Understands User Intents
NLP

Lean Finder is a semantic search engine designed for the Lean programming language and mathlib, improving theorem retrieval by understand...

arXiv - AI · 4 min ·
[2510.22512] Transitive RL: Value Learning via Divide and Conquer
Machine Learning

The paper introduces Transitive Reinforcement Learning (TRL), a novel value learning algorithm that enhances offline goal-conditioned rei...

arXiv - AI · 3 min ·
[2510.03817] TROLL: Trust Regions improve Reinforcement Learning for Large Language Models
LLMs

The paper presents TROLL, a novel method that replaces traditional PPO-like clipping in reinforcement learning with a trust region optimi...

arXiv - Machine Learning · 4 min ·
[2602.12268] CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use
AI Agents

The paper presents CM2, a novel reinforcement learning framework that utilizes checklist rewards to enhance multi-turn and multi-step too...

arXiv - AI · 4 min ·
[2510.03346] KVComm: Enabling Efficient LLM Communication through Selective KV Sharing
LLMs

The paper introduces KVComm, a novel framework for efficient communication between Large Language Models (LLMs) using selective KV pair s...

arXiv - AI · 4 min ·
[2602.11767] TSR: Trajectory-Search Rollouts for Multi-Turn RL of LLM Agents
LLMs

The paper introduces TSR (Trajectory-Search Rollouts), a novel approach to enhance multi-turn reinforcement learning for large language m...

arXiv - Machine Learning · 4 min ·
[2602.11298] Voxtral Realtime
Machine Learning

Voxtral Realtime presents a novel streaming automatic speech recognition model achieving offline transcription quality with sub-second la...

arXiv - AI · 5 min ·
[2602.08354] Does Your Reasoning Model Implicitly Know When to Stop Thinking?
Machine Learning

This article explores how large reasoning models (LRMs) can implicitly determine when to stop processing information, introducing a new s...

arXiv - AI · 4 min ·
[2510.00553] On Predictability of Reinforcement Learning Dynamics for Large Language Models
LLMs

This article explores the predictability of reinforcement learning dynamics in large language models (LLMs), highlighting key properties ...

arXiv - AI · 4 min ·
[2602.08104] Interpretable Failure Analysis in Multi-Agent Reinforcement Learning Systems
AI Agents

This paper presents a novel framework for interpretable failure analysis in Multi-Agent Reinforcement Learning (MARL) systems, focusing o...

arXiv - Machine Learning · 4 min ·
[2602.07883] ToolSelf: Unifying Task Execution and Self-Reconfiguration via Tool-Driven Intrinsic Adaptation
LLMs

The article introduces ToolSelf, a novel framework for enhancing agentic systems using Large Language Models (LLMs) by enabling runtime s...

arXiv - AI · 4 min ·
[2509.25424] Polychromic Objectives for Reinforcement Learning
Machine Learning

The paper introduces polychromic objectives for reinforcement learning, enhancing policy diversity and exploration in pretrained models, ...

arXiv - AI · 4 min ·