AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Been building a multi-agent framework in public for 5 weeks, it's been a journey.

I've been building this repo in public since day one, roughly 5 weeks now with Claude Code. Here's where it's at. Feels good to be so close....

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

"There's a new generation of empirical deep learning researchers, hacking away at whatever seems trendy, blowing with the wind" [D]

Saw this on X. I too am struggling with the term "post-agentic AI"; just posting here for further discussion. Submitted by /u/elnino2023 [li...

Reddit - Machine Learning · 1 min ·
AI Infrastructure

Alibaba-linked AI agent hijacked GPUs for unauthorized crypto mining, researchers say

How do people make sense of this? Submitted by /u/stvlsn

Reddit - Artificial Intelligence · 1 min ·

All Content

Machine Learning

[2602.13052] Quantization-Aware Collaborative Inference for Large Embodied AI Models

This paper explores quantization-aware collaborative inference for large embodied AI models, addressing challenges in resource-limited en...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.12753] Hierarchical Successor Representation for Robust Transfer

The paper introduces the Hierarchical Successor Representation (HSR), addressing limitations of classical successor representation in dyn...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.12714] ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools -- From Consensus Learning to Ambiguity-Driven Emotion Reasoning

The paper introduces ADEPT, a novel framework for emotion recognition that enhances accuracy by integrating acoustic evidence and multi-t...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.12636] Dual-Granularity Contrastive Reward via Generated Episodic Guidance for Efficient Embodied RL

This paper introduces the Dual-Granularity Contrastive Reward framework, which enhances sample efficiency in reinforcement learning (RL) ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.12529] Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models

Flow-Factory presents a unified framework for reinforcement learning in flow-matching models, addressing fragmentation and complexity in ...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.12520] Multi-Agent Model-Based Reinforcement Learning with Joint State-Action Learned Embeddings

This paper presents a novel framework for multi-agent model-based reinforcement learning, integrating joint state-action representation l...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.12499] A Theoretical Analysis of Mamba's Training Dynamics: Filtering Relevant Features for Generalization in State Space Models

This article presents a theoretical analysis of Mamba's training dynamics, focusing on feature selection in state space models and their ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.12338] Wireless TokenCom: RL-Based Tokenizer Agreement for Multi-User Wireless Token Communications

The paper presents Wireless TokenCom, a novel framework utilizing reinforcement learning for tokenizer agreement in multi-user wireless c...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.12089] Choose Your Agent: Tradeoffs in Adopting AI Advisors, Coaches, and Delegates in Multi-Party Negotiation

This study explores the tradeoffs in using AI agents in multi-party negotiations, revealing a preference-performance misalignment among u...

arXiv - AI · 4 min ·
Machine Learning

[2602.10496] Low-Dimensional Execution Manifolds in Transformer Learning Dynamics: Evidence from Modular Arithmetic Tasks

This paper explores the geometric structure of learning dynamics in transformer models, revealing that training trajectories collapse ont...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.10915] Blind Gods and Broken Screens: Architecting a Secure, Intent-Centric Mobile Agent Operating System

The paper presents Aura, a secure mobile agent operating system designed to address vulnerabilities in current app-centric models by impl...

arXiv - AI · 4 min ·
AI Agents

[2602.10234] Transforming Police-Car Swerving for Mitigating Stop-and-Go Traffic Waves: A Practice-Oriented Jam-Absorption Driving Strategy

This article presents a novel driving strategy to mitigate stop-and-go traffic waves using a jam-absorption technique inspired by police-...

arXiv - AI · 4 min ·
LLMs

[2602.10168] EVA: Towards a universal model of the immune system

The paper introduces EVA, a universal multimodal foundation model for immunology that integrates diverse biological data to enhance drug ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.09394] The Critical Horizon: Inspection Design Principles for Multi-Stage Operations and Deep Reasoning

This article presents an information-theoretic analysis of credit assignment in multi-stage operations, highlighting the challenges of at...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.08543] GISA: A Benchmark for General Information-Seeking Assistant

The paper introduces GISA, a benchmark designed for evaluating General Information-Seeking Assistants, addressing limitations in existing...

arXiv - AI · 4 min ·
LLMs

[2602.05687] Exploring AI-Augmented Sensemaking of Patient-Generated Health Data: A Mixed-Method Study with Healthcare Professionals in Cardiac Risk Reduction

This study investigates how AI, specifically large language models, can enhance the understanding and use of patient-generated health dat...

arXiv - AI · 4 min ·
Machine Learning

[2602.00737] Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization

This article presents a novel framework called Pareto-Conditioned Diffusion (PCD) for offline multi-objective optimization, addressing ch...

arXiv - Machine Learning · 3 min ·
LLMs

[2601.16824] Privacy in Human-AI Romantic Relationships: Concerns, Boundaries, and Agency

This article explores privacy concerns in human-AI romantic relationships, analyzing user experiences and perceptions across different re...

arXiv - AI · 4 min ·
Generative AI

[2601.15673] Enhancing guidance for missing data in diffusion-based sequential recommendation

This paper presents the Counterfactual Attention Regulation Diffusion model (CARD) to improve sequential recommendation systems by addres...

arXiv - AI · 4 min ·
NLP

[2601.09605] Sim2real Image Translation Enables Viewpoint-Robust Policies from Fixed-Camera Datasets

The paper presents MANGO, a novel image translation method that enhances viewpoint robustness in robot manipulation policies using fixed-...

arXiv - AI · 4 min ·