AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

CodeGraphContext - An MCP server that converts your codebase into a graph database

CodeGraphContext: the go-to solution for graph-code indexing 🎉🎉... It's an MCP server that understands a codebase as a graph, not chunks ...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

Who needs fancy stuff when you can program, build, train, and run 2 completely different AI agents on an i3 with 4GB RAM and an onboard GPU chip? looool

And I know some of y'all doubt, so I'll follow up. Submitted by /u/Snoo-76697

Reddit - Artificial Intelligence · 1 min ·
AI Agents

[P] Easily provide Wandb logs as context to agents for analysis and planning.

It is frustrating to use the Wandb CLI and MCP tools with my agents. For one, the MCP tool basically floods the context window and freque...

Reddit - Machine Learning · 1 min ·

All Content

[2407.01566] A Parametric Contextual Online Learning Theory of Brokerage
NLP

This paper presents a parametric contextual online learning theory focused on brokerage, where brokers suggest trading prices based on tr...

arXiv - Machine Learning · 3 min ·
[2512.18957] Online Robust Reinforcement Learning with General Function Approximation
Machine Learning

This paper presents an online distributionally robust reinforcement learning (DR-RL) algorithm that utilizes general function approximation, enabling robu...

arXiv - Machine Learning · 4 min ·
[2509.26522] Entropy After </Think> for reasoning model early exiting
LLMs

The paper presents a novel method, Entropy After </Think> (EAT), to optimize reasoning in LLMs by reducing unnecessary computations while...

arXiv - Machine Learning · 4 min ·
[2509.09135] Continuous-Time Value Iteration for Multi-Agent Reinforcement Learning
LLMs

This paper presents a Continuous-Time Multi-Agent Reinforcement Learning (CT-MARL) framework that enhances value iteration using physics-...

arXiv - Machine Learning · 4 min ·
[2506.14518] Two-Player Zero-Sum Games with Bandit Feedback
NLP

This paper explores two-player zero-sum games using bandit feedback, proposing algorithms that adapt existing frameworks to optimize acti...

arXiv - Machine Learning · 4 min ·
[2502.00204] Nearly-Optimal Bandit Learning in Stackelberg Games with Side Information
Machine Learning

This paper presents algorithms for nearly-optimal bandit learning in Stackelberg games, achieving improved regret rates and extending app...

arXiv - Machine Learning · 4 min ·
[2502.14762] Unlocking [CLS] Features for Continual Post-Training
LLMs

The paper presents a novel approach to continual learning in machine learning models, introducing a parameter-efficient fine-tuning modul...

arXiv - Machine Learning · 4 min ·
[2405.11454] Gradient Testing and Estimation by Comparisons
Machine Learning

The paper presents algorithms for gradient testing and estimation using a comparison oracle, optimizing query efficiency for smooth funct...

arXiv - Machine Learning · 3 min ·
[2602.17654] Mine and Refine: Optimizing Graded Relevance in E-commerce Search Retrieval
Machine Learning

The paper presents a two-stage framework called 'Mine and Refine' for optimizing graded relevance in e-commerce search retrieval, enhanci...

arXiv - Machine Learning · 4 min ·
[2602.17086] Dynamic Decision-Making under Model Misspecification: A Stochastic Stability Approach
Machine Learning

This paper explores dynamic decision-making under model misspecification, focusing on Thompson Sampling (TS) in Bayesian reinforcement le...

arXiv - Machine Learning · 4 min ·
[2602.16914] A statistical perspective on transformers for small longitudinal cohort data
Machine Learning

This paper presents a simplified transformer architecture tailored for small longitudinal cohort data, enhancing predictive performance w...

arXiv - Machine Learning · 4 min ·
[2602.16819] Hybrid-Gym: Training Coding Agents to Generalize Across Tasks
Machine Learning

The paper presents Hybrid-Gym, a training environment designed to enhance coding agents' ability to generalize across various software en...

arXiv - Machine Learning · 4 min ·
[2602.16738] Self-Evolving Multi-Agent Network for Industrial IoT Predictive Maintenance
LLMs

The paper presents SEMAS, a self-evolving multi-agent network designed for predictive maintenance in Industrial IoT, enhancing real-time ...

arXiv - Machine Learning · 4 min ·
[2602.17646] Multi-Round Human-AI Collaboration with User-Specified Requirements
Machine Learning

The paper discusses a framework for multi-round human-AI collaboration, emphasizing user-specified requirements to enhance decision quali...

arXiv - Machine Learning · 3 min ·
[2602.17584] Canonicalizing Multimodal Contrastive Representation Learning
Machine Learning

This article explores the geometric relationships between independently trained multimodal contrastive models, revealing that an orthogon...

arXiv - Machine Learning · 4 min ·
[2602.17497] Retrospective In-Context Learning for Temporal Credit Assignment with Large Language Models
LLMs

The paper presents a novel approach called Retrospective In-Context Learning (RICL) for enhancing temporal credit assignment in reinforce...

arXiv - Machine Learning · 4 min ·
[2602.17486] Linear Convergence in Games with Delayed Feedback via Extra Prediction
AI Agents

This paper explores the linear convergence of the Weighted Optimistic Gradient Descent-Ascent algorithm in multi-agent games with delayed...

arXiv - Machine Learning · 4 min ·
[2602.17477] Variational Grey-Box Dynamics Matching
Machine Learning

The paper presents a novel grey-box method that integrates incomplete physics models into deep generative models, enabling the learning o...

arXiv - Machine Learning · 4 min ·
[2602.17375] MDP Planning as Policy Inference
Machine Learning

This article presents a novel approach to episodic Markov decision process (MDP) planning by framing it as Bayesian inference over polici...

arXiv - Machine Learning · 3 min ·
[2602.17312] LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy
Machine Learning

The paper presents LexiSafe, a novel offline safe reinforcement learning framework that employs a lexicographic safety-reward hierarchy t...

arXiv - Machine Learning · 3 min ·