AI Agents

Autonomous agents, tool use, and agentic systems


Top This Week

LLMs

Been building a multi-agent framework in public for 5 weeks, it's been a journey.

I've been building this repo public since day one, roughly 5 weeks now with Claude Code. Here's where it's at. Feels good to be so close....

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

Machine Learning

"There's a new generation of empirical deep learning researchers, hacking away at whatever seems trendy, blowing with the wind" [D]

Saw this on X. I too am struggling with the term "post-agentic AI"; just posting here for further discussion. submitted by /u/elnino2023

Reddit - Machine Learning · 1 min · about 11 hours ago

AI Infrastructure

Alibaba-linked AI agent hijacked GPUs for unauthorized crypto mining, researchers say

How do people make sense of this? submitted by /u/stvlsn

Reddit - Artificial Intelligence · 1 min · about 16 hours ago

All Content

Machine Learning

[2602.12963] Information-theoretic analysis of world models in optimal reward maximizers

This paper presents an information-theoretic analysis of world models in optimal reward maximizers, quantifying the information conveyed ...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12876] BrowseComp-$V^3$: A Visual, Vertical, and Verifiable Benchmark for Multimodal Browsing Agents

BrowseComp-$V^3$ introduces a new benchmark for evaluating multimodal browsing agents, focusing on complex reasoning across visual and te...

arXiv - AI · 4 min · about 2 months ago

AI Agents

[2602.12852] WebClipper: Efficient Evolution of Web Agents with Graph-based Trajectory Pruning

WebClipper introduces a novel framework for optimizing web agent trajectories through graph-based pruning, enhancing search efficiency an...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12670] SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

The paper introduces SkillsBench, a benchmark assessing the effectiveness of agent skills across 86 tasks in 11 domains, revealing signif...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12662] Think Fast and Slow: Step-Level Cognitive Depth Adaptation for LLM Agents

This paper introduces CogRouter, a framework for large language models (LLMs) that enables dynamic adaptation of cognitive depth, enhanci...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12631] AI Agents for Inventory Control: Human-LLM-OR Complementarity

This paper explores the integration of AI agents, particularly large language models (LLMs), with traditional operations research (OR) me...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.12617] GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

GeoAgent introduces a novel model for geolocation tasks, enhancing AI's reasoning capabilities with geographic characteristics and outper...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12566] To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

This paper explores the effectiveness of multi-domain reinforcement learning for large language models, comparing mixed multi-task traini...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12419] Intent-Driven Smart Manufacturing Integrating Knowledge Graphs and Large Language Models

This article discusses a framework that integrates Large Language Models and Knowledge Graphs to enhance intent-driven interactions in sm...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.12544] Scaling Web Agent Training through Automatic Data Generation and Fine-grained Evaluation

This paper presents a scalable pipeline for generating high-quality training data for web agents, introducing a novel evaluation framewor...

arXiv - AI · 3 min · about 2 months ago

AI Safety

[2602.12316] GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory

GT-HarmBench introduces a benchmark for evaluating AI safety risks in multi-agent environments, highlighting significant reliability gaps...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.12389] Evolving Beyond Snapshots: Harmonizing Structure and Sequence via Entity State Tuning for Temporal Knowledge Graph Forecasting

This paper presents Entity State Tuning (EST), a novel framework for improving temporal knowledge graph forecasting by maintaining persis...

arXiv - AI · 4 min · about 2 months ago

LLMs

Customizable AI Companions.

The article discusses the potential of customizable AI companions that can engage in real-time video calls, leveraging technologies like ...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

Machine Learning

[D] METR TH1.1: “working_time” is wildly different across models. Quick breakdown + questions.

The article discusses METR's Time Horizon benchmark (TH1.1), highlighting significant differences in 'working_time' across various models...

Reddit - Machine Learning · 1 min · about 2 months ago

AI Agents

My awkward first date with an AI companion

A Mashable writer experiences an awkward date with an EVA AI companion at a pop-up cafe, exploring the nuances of AI relationships and us...

AI Tools & Products · 11 min · about 2 months ago

AI Agents

Rethinking human connection under the influence of AI.

The article explores the nature of human connection in the context of AI interactions, arguing that while AI can simulate dialogue, it la...

AI Tools & Products · 5 min · about 2 months ago

LLMs

Looking for early testers for my competitive analysis tool (Claude needed currently)

The article seeks early testers for CompetitiveOS, a tool designed to streamline competitive analysis in the AI education sector by autom...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

AI Safety

Ads in AI chatbots raise privacy concerns as companies seek new revenue

The introduction of ads in AI chatbots raises privacy concerns as companies like OpenAI and Microsoft explore new revenue models amidst u...

AI Tools & Products · 5 min · about 2 months ago

AI Startups

Why Nonprofits Can’t Afford to Ignore AI

This article discusses the importance of AI for nonprofits, emphasizing how these organizations can leverage technology to enhance their ...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

AI Agents

OpenClaw founder Peter Steinberger is joining OpenAI | The Verge

Peter Steinberger, founder of OpenClaw, is joining OpenAI, with the OpenClaw project continuing as an open-source initiative.

The Verge - AI · 4 min · about 2 months ago

