AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Been building a multi-agent framework in public for 5 weeks, it's been a journey.

I've been building this repo in public since day one, roughly 5 weeks now with Claude Code. Here's where it's at. Feels good to be so close....

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

"There's a new generation of empirical deep learning researchers, hacking away at whatever seems trendy, blowing with the wind" [D]

Saw this on X. I too am struggling with the term "post-agentic AI"; just posting here for further discussion. submitted by /u/elnino2023 [li...

Reddit - Machine Learning · 1 min ·
AI Infrastructure

Alibaba-linked AI agent hijacked GPUs for unauthorized crypto mining, researchers say

How do people make sense of this? submitted by /u/stvlsn

Reddit - Artificial Intelligence · 1 min ·

All Content

NLP

[2512.18080] From Prompt to Product: A Human-Centered Benchmark of Agentic App Generation Systems

This paper introduces a human-centered benchmark for evaluating agentic app generation systems, comparing platforms like Replit, Bolt, an...

arXiv - AI · 4 min ·
LLMs

[2511.10453] Reasoning about Intent for Ambiguous Requests

This paper explores how large language models can better handle ambiguous requests by generating multiple interpretation-answer pairs, en...

arXiv - AI · 3 min ·
Machine Learning

[2510.24803] MASPRM: Multi-Agent System Process Reward Model

The MASPRM paper introduces a novel Multi-Agent System Process Reward Model that enhances performance during inference by guiding search ...

arXiv - AI · 3 min ·
Machine Learning

[2509.14832] Diffusion-Based Scenario Tree Generation for Multivariate Time Series Prediction and Multistage Stochastic Optimization

The paper presents a Diffusion Scenario Tree (DST) framework for multivariate time series prediction and multistage stochastic optimizati...

arXiv - Machine Learning · 4 min ·
LLMs

[2508.12685] ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction

ToolACE-MT introduces a non-autoregressive framework for generating high-quality multi-turn dialogues in agentic interactions, enhancing ...

arXiv - Machine Learning · 3 min ·
LLMs

[2508.17742] EEG-FM-Bench: A Comprehensive Benchmark for the Systematic Evaluation of EEG Foundation Models

The paper presents EEG-FM-Bench, a standardized benchmark for evaluating EEG foundation models, addressing inconsistencies in current eva...

arXiv - AI · 4 min ·
LLMs

[2508.08275] MLLM-CTBench: A Benchmark for Continual Instruction Tuning with Reasoning Process Diagnosis

The paper presents MLLM-CTBench, a benchmark for continual instruction tuning of multimodal large language models, addressing the need fo...

arXiv - AI · 4 min ·
LLMs

[2508.05004] R-Zero: Self-Evolving Reasoning LLM from Zero Data

The article presents R-Zero, a self-evolving reasoning LLM that autonomously generates training data, improving AI capabilities without h...

arXiv - Machine Learning · 4 min ·
LLMs

[2507.16696] FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation

FISHER is a proposed foundation model aimed at improving the analysis of multi-modal industrial signals, addressing the challenges posed ...

arXiv - Machine Learning · 4 min ·
Robotics

[2507.12108] Multimodal Coordinated Online Behavior: Trade-offs and Strategies

This paper explores multimodal coordinated online behavior, analyzing trade-offs between different integration strategies and their effec...

arXiv - Machine Learning · 4 min ·
AI Safety

[2507.02310] Holistic Continual Learning under Concept Drift with Adaptive Memory Realignment

This paper presents a novel framework for continual learning that addresses concept drift through Adaptive Memory Realignment (AMR), enha...

arXiv - Machine Learning · 4 min ·
LLMs

[2503.22968] Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models

This article introduces the Haerae Evaluation Toolkit (HRET), a unified framework for evaluating the capabilities of Korean language mode...

arXiv - AI · 4 min ·
Machine Learning

[2412.07909] Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning

This paper explores the modality gap in contrastive multimodal learning, analyzing its causes and proposing methods to mitigate it for im...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.04634] WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

The paper introduces WideSeek-R1, a multi-agent reinforcement learning framework aimed at improving broad information seeking by enhancin...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.02709] ATLAS : Adaptive Self-Evolutionary Research Agent with Task-Distributed Multi-LLM Supporters

The paper presents ATLAS, an adaptive self-evolutionary research agent that utilizes task-distributed multi-LLM supporters to enhance per...

arXiv - AI · 3 min ·
NLP

[2601.10485] Panning for Gold: Expanding Domain-Specific Knowledge Graphs with General Knowledge

The paper proposes a novel approach for enhancing domain-specific knowledge graphs (DKGs) by integrating general knowledge graphs (GKGs) ...

arXiv - AI · 4 min ·
LLMs

[2601.00004] Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study

This study explores the use of fine-tuned large language models for automated depression screening in Nigerian Pidgin English, addressing...

arXiv - Machine Learning · 4 min ·
LLMs

[2512.19135] Understanding Chain-of-Thought in Large Language Models via Topological Data Analysis

This paper explores the structural analysis of reasoning chains in large language models (LLMs) using Topological Data Analysis (TDA), re...

arXiv - AI · 4 min ·
Generative AI

[2512.12182] TA-KAND: Two-stage Attention Triple Enhancement and U-KAN based Diffusion For Few-shot Knowledge Graph Completion

The paper presents TA-KAND, a novel framework for few-shot knowledge graph completion that employs a two-stage attention mechanism and U-...

arXiv - Machine Learning · 3 min ·
LLMs

[2510.23883] Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

This article explores the security implications of agentic AI systems, detailing specific threats, defense strategies, and evaluation met...

arXiv - AI · 3 min ·