AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Persistent memory system for AI agents: single SQLite file, no external server, no API keys. Free and open source - BrainCTL

Every agent I build forgets everything between sessions. I got tired of it and built brainctl. pip install brainctl, then: from agentmemo...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Why does Multi-Agent RL fail to act like a real society in Spatial Game Theory? [P] [R]

Hey everyone, I’m building a project for my university Machine Learning course called "Social network analysis using iterated game theory...

Reddit - Machine Learning · 1 min ·
NLP

AWS turns its S3 storage service into a file system for AI agents

AI News - General ·

All Content

Machine Learning

[2505.12641] Single Image Reflection Separation via Dual Prior Interaction Transformer

This paper presents a novel approach to single image reflection separation using a Dual Prior Interaction Transformer, enhancing the extr...

arXiv - AI · 4 min ·
LLMs

[2601.22323] Models Under SCOPE: Scalable and Controllable Routing via Pre-hoc Reasoning

The paper presents SCOPE, a novel routing framework for language models that dynamically predicts cost and performance, enhancing efficie...

arXiv - Machine Learning · 4 min ·
LLMs

[2601.20802] Reinforcement Learning via Self-Distillation

This paper introduces Self-Distillation Policy Optimization (SDPO) for reinforcement learning, leveraging rich feedback to enhance learni...

arXiv - AI · 4 min ·
LLMs

[2601.18702] From Fuzzy to Exact: The Halo Architecture for Infinite-Depth Reasoning via Rational Arithmetic

This paper introduces the Halo Architecture, a new framework for infinite-depth reasoning using rational arithmetic, aiming to enhance th...

arXiv - AI · 4 min ·
LLMs

[2504.21205] SecRepoBench: Benchmarking Code Agents for Secure Code Completion in Real-World Repositories

The paper presents SecRepoBench, a benchmark designed to evaluate code agents' performance in secure code completion across real-world C/...

arXiv - AI · 3 min ·
Machine Learning

[2601.16443] Endless Terminals: Scaling RL Environments for Terminal Agents

The paper presents 'Endless Terminals', a scalable reinforcement learning (RL) environment designed for training terminal agents through ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2504.20903] Modeling AI-Human Collaboration as a Multi-Agent Adaptation

This paper explores AI-human collaboration through agent-based simulations, revealing how distinct decision-making heuristics impact perf...

arXiv - AI · 4 min ·
LLMs

[2601.12415] Orthogonalized Policy Optimization: Decoupling Sampling Geometry from Optimization Geometry in RLHF

This paper introduces Orthogonalized Policy Optimization (OPO), a new approach in reinforcement learning that separates sampling and opti...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2601.09495] Parallelizable memory recurrent units

The paper introduces memory recurrent units (MRUs), a new family of RNNs that combine persistent memory with parallelizable computations,...

arXiv - Machine Learning · 4 min ·
Robotics

[2502.20326] Deep Reinforcement Learning based Autonomous Decision-Making for Cooperative UAVs: A Search and Rescue Real World Application

This paper presents a novel framework for autonomous decision-making in UAVs during search-and-rescue operations, demonstrating effective...

arXiv - AI · 4 min ·
LLMs

[2502.16730] RapidPen: Fully Automated IP-to-Shell Penetration Testing with LLM-based Agents

RapidPen is a novel automated penetration testing framework that utilizes large language models to autonomously exploit vulnerabilities, ...

arXiv - AI · 4 min ·
Machine Learning

[2502.12581] The Majority Vote Paradigm Shift: When Popular Meets Optimal

The article explores the Majority Vote (MV) method for data labeling, analyzing its optimality in aggregating labels from multiple annota...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2501.07575] Dataset Distillation via Committee Voting

The paper presents a novel method for dataset distillation called Committee Voting for Dataset Distillation (CV-DD), which enhances data ...

arXiv - AI · 4 min ·
LLMs

[2512.00499] ESPO: Entropy Importance Sampling Policy Optimization

The paper presents ESPO, a novel framework for optimizing reinforcement learning in large language models, addressing training stability ...

arXiv - AI · 4 min ·
LLMs

[2412.00686] LVLM-COUNT: Enhancing the Counting Ability of Large Vision-Language Models

The paper presents LVLM-COUNT, a method to enhance the counting ability of large vision-language models (LVLMs) by using a divide-and-con...

arXiv - AI · 4 min ·
LLMs

[2511.07833] MURPHY: Multi-Turn GRPO for Self Correcting Code Generation

The paper presents MURPHY, a multi-turn reinforcement learning framework that enhances code generation by incorporating execution feedbac...

arXiv - AI · 3 min ·
Machine Learning

[2511.06781] On the Mechanisms of Collaborative Learning in VAE Recommenders

This paper explores the mechanisms of collaborative learning in Variational Autoencoder (VAE) recommenders, highlighting the role of late...

arXiv - AI · 4 min ·
Machine Learning

[2510.14581] Model-agnostic Selective Labeling with Provable Statistical Guarantees

The paper presents 'Conformal Labeling', a model-agnostic method that ensures high-quality AI-generated labels by controlling the false d...

arXiv - AI · 4 min ·
Machine Learning

[2406.04955] Experimental Evaluation of ROS-Causal in Real-World Human-Robot Spatial Interaction Scenarios

This article presents an experimental evaluation of ROS-Causal, a framework for causal discovery in human-robot spatial interactions, dem...

arXiv - AI · 4 min ·
Machine Learning

[2510.10854] Discrete State Diffusion Models: A Sample Complexity Perspective

This article presents a theoretical framework for discrete-state diffusion models, offering the first sample complexity bounds and insigh...

arXiv - AI · 3 min ·