AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

CodeGraphContext - An MCP server that converts your codebase into a graph database

CodeGraphContext - the go-to solution for graph-code indexing 🎉🎉... It's an MCP server that understands a codebase as a graph, not chunks ...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

AI Infrastructure

Who needs fancy stuff, When you can program, build, train and run 2 completely different ai agents on an i3 4GB RAM and onboard gpu chip? looool

And I know some of y'all doubt - so I’ll follow up. Submitted by /u/Snoo-76697.

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

AI Agents

[P] Easily provide Wandb logs as context to agents for analysis and planning.

It is frustrating to use the Wandb CLI and MCP tools with my agents. For one, the MCP tool basically floods the context window and freque...

Reddit - Machine Learning · 1 min · about 9 hours ago

All Content

NLP

[2407.01566] A Parametric Contextual Online Learning Theory of Brokerage

This paper presents a parametric contextual online learning theory focused on brokerage, where brokers suggest trading prices based on tr...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2512.18957] Online Robust Reinforcement Learning with General Function Approximation

This paper presents an online distributionally robust reinforcement learning (DR-RL) algorithm that utilizes general function approximation, enabling robu...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2509.26522] Entropy After $\langle \texttt{/Think} \rangle$ for reasoning model early exiting

The paper presents a novel method, Entropy After </Think> (EAT), to optimize reasoning in LLMs by reducing unnecessary computations while...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2509.09135] Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning

This paper presents a Continuous-Time Multi-Agent Reinforcement Learning (CT-MARL) framework that enhances value iteration using physics-...

arXiv - Machine Learning · 4 min · about 2 months ago

NLP

[2506.14518] Two-Player Zero-Sum Games with Bandit Feedback

This paper explores two-player zero-sum games using bandit feedback, proposing algorithms that adapt existing frameworks to optimize acti...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2502.00204] Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information

This paper presents algorithms for nearly-optimal bandit learning in Stackelberg games, achieving improved regret rates and extending app...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2502.14762] Unlocking [CLS] Features for Continual Post-Training

The paper presents a novel approach to continual learning in machine learning models, introducing a parameter-efficient fine-tuning modul...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2405.11454] Gradient Testing and Estimation by Comparisons

The paper presents algorithms for gradient testing and estimation using a comparison oracle, optimizing query efficiency for smooth funct...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17654] Mine and Refine: Optimizing Graded Relevance in E-commerce Search Retrieval

The paper presents a two-stage framework called 'Mine and Refine' for optimizing graded relevance in e-commerce search retrieval, enhanci...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17086] Dynamic Decision-Making under Model Misspecification: A Stochastic Stability Approach

This paper explores dynamic decision-making under model misspecification, focusing on Thompson Sampling (TS) in Bayesian reinforcement le...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16914] A statistical perspective on transformers for small longitudinal cohort data

This paper presents a simplified transformer architecture tailored for small longitudinal cohort data, enhancing predictive performance w...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16819] Hybrid-Gym: Training Coding Agents to Generalize Across Tasks

The paper presents Hybrid-Gym, a training environment designed to enhance coding agents' ability to generalize across various software en...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.16738] Self-Evolving Multi-Agent Network for Industrial IoT Predictive Maintenance

The paper presents SEMAS, a self-evolving multi-agent network designed for predictive maintenance in Industrial IoT, enhancing real-time ...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17646] Multi-Round Human-AI Collaboration with User-Specified Requirements

The paper discusses a framework for multi-round human-AI collaboration, emphasizing user-specified requirements to enhance decision quali...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17584] Canonicalizing Multimodal Contrastive Representation Learning

This article explores the geometric relationships between independently trained multimodal contrastive models, revealing that an orthogon...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.17497] Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models

The paper presents a novel approach called Retrospective In-Context Learning (RICL) for enhancing temporal credit assignment in reinforce...

arXiv - Machine Learning · 4 min · about 2 months ago

AI Agents

[2602.17486] Linear Convergence in Games with Delayed Feedback via Extra Prediction

This paper explores the linear convergence of the Weighted Optimistic Gradient Descent-Ascent algorithm in multi-agent games with delayed...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17477] Variational Grey-Box Dynamics Matching

The paper presents a novel grey-box method that integrates incomplete physics models into deep generative models, enabling the learning o...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17375] MDP Planning as Policy Inference

This article presents a novel approach to episodic Markov decision process (MDP) planning by framing it as Bayesian inference over polici...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17312] LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

The paper presents LexiSafe, a novel offline safe reinforcement learning framework that employs a lexicographic safety-reward hierarchy t...

arXiv - Machine Learning · 3 min · about 2 months ago
