AI Agents

Autonomous agents, tool use, and agentic systems


Top This Week

AI Agents

NeuBird AI Raises $19.3 Million To Scale Agentic AI

NeuBird AI, a San Francisco-based artificial intelligence company, has raised $19.3 million in funding to scale its agentic AI technology...

AI News - General · 4 min · about 1 hour ago

AI Agents

CodeGraphContext - An MCP server that converts your codebase into a graph database

CodeGraphContext - the go-to solution for graph-code indexing 🎉🎉... It's an MCP server that understands a codebase as a graph, not chunks ...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

AI Infrastructure

Who needs fancy stuff, When you can program, build, train and run 2 completely different ai agents on an i3 4GB RAM and onboard gpu chip? looool

And I know some of yall doubt - so I’ll follow up. Submitted by /u/Snoo-76697.

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

All Content

Machine Learning

[2602.17312] LexiSafe: Offline Safe Reinforcement Learning with Lexicographic Safety-Reward Hierarchy

The paper presents LexiSafe, a novel offline safe reinforcement learning framework that employs a lexicographic safety-reward hierarchy t...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17276] RLGT: A reinforcement learning framework for extremal graph theory

The paper introduces RLGT, a novel reinforcement learning framework designed for extremal graph theory, enhancing the application of RL i...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17144] When More Experts Hurt: Underfitting in Multi-Expert Learning to Defer

This article discusses the challenges of multi-expert learning in machine learning, highlighting how underfitting can occur when multiple...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17103] Online Learning with Improving Agents: Multiclass, Budgeted Agents and Bandit Learners

This paper explores online learning models using improving agents, focusing on multiclass setups, budgeted agents, and bandit learners, e...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17068] Spatio-temporal dual-stage hypergraph MARL for human-centric multimodal corridor traffic signal control

This paper introduces STDSH-MARL, a novel framework for human-centric traffic signal control that enhances multimodal transportation effi...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.17025] WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning

The paper introduces WS-GRPO, a method for improving rollout efficiency in language model training by providing correctness-aware guidanc...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.17013] Malliavin Calculus as Stochastic Backpropagation

This paper establishes a connection between pathwise and score-function gradient estimators in stochastic backpropagation, introducing a ...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.17009] Action-Graph Policies: Learning Action Co-dependencies in Multi-Agent Reinforcement Learning

The paper introduces Action Graph Policies (AGP) for multi-agent reinforcement learning, emphasizing the importance of action co-dependen...

arXiv - Machine Learning · 3 min · about 2 months ago

AI Agents

[2602.16965] Multi-Agent Lipschitz Bandits

The paper presents a novel approach to the multi-agent Lipschitz bandit problem, proposing a communication-free policy that maximizes col...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.16954] Neural Proposals, Symbolic Guarantees: Neuro-Symbolic Graph Generation with Hard Constraints

The paper presents Neuro-Symbolic Graph Generative Modeling (NSGGM), a framework that enhances molecule generation by integrating symboli...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.16849] On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking

This paper analyzes how two-layer neural networks learn to solve the modular addition task, providing insights into feature learning, tra...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16837] A Residual-Aware Theory of Position Bias in Transformers

This paper presents a residual-aware theory explaining the position bias in Transformers, revealing how residual connections prevent atte...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.16823] Formal Mechanistic Interpretability: Automated Circuit Discovery with Provable Guarantees

This article presents a novel approach to automated circuit discovery in neural networks, emphasizing provable guarantees for robustness ...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16793] Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models

The paper presents a novel inference pipeline that leverages off-the-shelf models to solve International Mathematical Olympiad problems e...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.16787] Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency

This paper introduces Double Counterfactual Consistency (DCC), a method for evaluating and enhancing causal reasoning in large language m...

arXiv - Machine Learning · 3 min · about 2 months ago

Robotics

[2602.11337] MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation

MolmoSpaces introduces a large-scale open ecosystem designed for benchmarking robot navigation and manipulation, featuring over 230k dive...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.07666] SoK: DARPA's AI Cyber Challenge (AIxCC): Competition Design, Architectures, and Lessons Learned

This paper analyzes DARPA's AI Cyber Challenge (AIxCC), focusing on competition design, architectural approaches of finalists, and key le...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.06355] Di3PO - Diptych Diffusion DPO for Targeted Improvements in Image Generation

The paper presents Di3PO, a novel method for improving image generation in text-to-image diffusion models by efficiently creating targete...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.03972] Fixed Budget is No Harder Than Fixed Confidence in Best-Arm Identification up to Logarithmic Factors

This paper explores the relationship between fixed-budget and fixed-confidence settings in best-arm identification, demonstrating that th...

arXiv - Machine Learning · 4 min · about 2 months ago

Generative AI

[2601.08697] Auditing Student-AI Collaboration: A Case Study of Online Graduate CS Students

This study audits the collaboration between online graduate CS students and AI, exploring preferences for automation in academic tasks an...

arXiv - AI · 3 min · about 2 months ago

