AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

[2511.06448] When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Platforms

Abstract page for arXiv paper 2511.06448: When AI Agents Collude Online: Financial Fraud Risks by Collaborative LLM Agents on Social Plat...

arXiv - AI · 4 min · 28 minutes ago

AI Agents

[2510.20728] Co-Designing Quantum Codes with Transversal Diagonal Gates via Multi-Agent Systems

Abstract page for arXiv paper 2510.20728: Co-Designing Quantum Codes with Transversal Diagonal Gates via Multi-Agent Systems

arXiv - AI · 4 min · 28 minutes ago

LLMs

[2510.06800] FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

Abstract page for arXiv paper 2510.06800: FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipe...

arXiv - AI · 4 min · 28 minutes ago

All Content

Machine Learning

[2602.17086] Dynamic Decision-Making under Model Misspecification: A Stochastic Stability Approach

This paper explores dynamic decision-making under model misspecification, focusing on Thompson Sampling (TS) in Bayesian reinforcement le...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16914] A statistical perspective on transformers for small longitudinal cohort data

This paper presents a simplified transformer architecture tailored for small longitudinal cohort data, enhancing predictive performance w...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16819] Hybrid-Gym: Training Coding Agents to Generalize Across Tasks

The paper presents Hybrid-Gym, a training environment designed to enhance coding agents' ability to generalize across various software en...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.16738] Self-Evolving Multi-Agent Network for Industrial IoT Predictive Maintenance

The paper presents SEMAS, a self-evolving multi-agent network designed for predictive maintenance in Industrial IoT, enhancing real-time ...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17646] Multi-Round Human-AI Collaboration with User-Specified Requirements

The paper discusses a framework for multi-round human-AI collaboration, emphasizing user-specified requirements to enhance decision quali...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17584] Canonicalizing Multimodal Contrastive Representation Learning

This article explores the geometric relationships between independently trained multimodal contrastive models, revealing that an orthogon...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.17497] Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

The paper presents a novel approach called Retrospective In-Context Learning (RICL) for enhancing temporal credit assignment in reinforce...

arXiv - Machine Learning · 4 min · about 2 months ago

AI Agents

[2602.17486] Linear Convergence in Games with Delayed Feedback via Extra Prediction

This paper explores the linear convergence of the Weighted Optimistic Gradient Descent-Ascent algorithm in multi-agent games with delayed...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17477] Variational Grey-Box Dynamics Matching

The paper presents a novel grey-box method that integrates incomplete physics models into deep generative models, enabling the learning o...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17375] MDP Planning as Policy Inference

This article presents a novel approach to episodic Markov decision process (MDP) planning by framing it as Bayesian inference over polici...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17312] LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

The paper presents LexiSafe, a novel offline safe reinforcement learning framework that employs a lexicographic safety-reward hierarchy t...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17276] RLGT: A reinforcement learning framework for extremal graph theory

The paper introduces RLGT, a novel reinforcement learning framework designed for extremal graph theory, enhancing the application of RL i...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17144] When More Experts Hurt: Underfitting in Multi-Expert Learning to Defer

This article discusses the challenges of multi-expert learning in machine learning, highlighting how underfitting can occur when multiple...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17103] Online Learning with Improving Agents: Multiclass, Budgeted Agents and Bandit Learners

This paper explores online learning models using improving agents, focusing on multiclass setups, budgeted agents, and bandit learners, e...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17068] Spatio-temporal dual-stage hypergraph MARL for human-centric multimodal corridor traffic signal control

This paper introduces STDSH-MARL, a novel framework for human-centric traffic signal control that enhances multimodal transportation effi...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.17025] WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning

The paper introduces WS-GRPO, a method for improving rollout efficiency in language model training by providing correctness-aware guidanc...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17013] Malliavin Calculus as Stochastic Backpropagation

This paper establishes a connection between pathwise and score-function gradient estimators in stochastic backpropagation, introducing a ...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17009] Action-Graph Policies: Learning Action Co-dependencies in Multi-Agent Reinforcement Learning

The paper introduces Action Graph Policies (AGP) for multi-agent reinforcement learning, emphasizing the importance of action co-dependen...

arXiv - Machine Learning · 3 min · about 2 months ago

AI Agents

[2602.16965] Multi-Agent Lipschitz Bandits

The paper presents a novel approach to the multi-agent Lipschitz bandit problem, proposing a communication-free policy that maximizes col...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.16954] Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

The paper presents Neuro-Symbolic Graph Generative Modeling (NSGGM), a framework that enhances molecule generation by integrating symboli...

arXiv - Machine Learning · 3 min · about 2 months ago
