AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

AI agents have been blindly guessing your UI this whole time. Here's the file that fixes it.

Every time you ask an AI coding agent to build UI, it invents everything from scratch. Colors. Fonts. Spacing. Button styles. All of it -...

Reddit - Artificial Intelligence · 1 min ·
LLMs

OpenClaw security checklist: practical safeguards for AI agents

Here is one of the better guides on ensuring safety when deploying OpenClaw: https://chatgptguide.ai/openclaw-security-checkl...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Auto agent - A self-improving domain-expertise agent

Someone open-sourced an AI agent that autonomously upgraded itself to #1 across multiple domains in under 24 hours… then open-sourced the e...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.19169] Virtual Parameter Sharpening: Dynamic Low-Rank Perturbations for Inference-Time Reasoning Enhancement
Machine Learning

The paper introduces Virtual Parameter Sharpening (VPS), a novel technique for enhancing inference-time reasoning in transformer models t...

arXiv - AI · 3 min ·
[2602.18458] The Story is Not the Science: Execution-Grounded Evaluation of Mechanistic Interpretability Research
Robotics

The article presents a novel evaluation framework for mechanistic interpretability research, utilizing AI agents to enhance research rigo...

arXiv - Machine Learning · 3 min ·
[2602.18456] Beyond single-channel agentic benchmarking
Robotics

This paper critiques the current single-channel benchmarking of AI safety, advocating for a more holistic approach that considers the int...

arXiv - AI · 3 min ·
[2602.19142] Celo2: Towards Learned Optimization Free Lunch
LLMs

The paper 'Celo2: Towards Learned Optimization Free Lunch' presents a novel learned optimizer that significantly reduces the computationa...

arXiv - AI · 3 min ·
[2602.18455] Impact of AI Search Summaries on Website Traffic: Evidence from Google AI Overviews and Wikipedia
LLMs

This article examines the impact of AI-generated search summaries on website traffic, specifically analyzing how Google's AI Overviews af...

arXiv - AI · 4 min ·
[2602.18453] LLM-Assisted Replication for Quantitative Social Science
LLMs

The paper presents an LLM-based system designed to replicate statistical analyses in quantitative social science, addressing the replicat...

arXiv - AI · 3 min ·
[2602.18451] Developing a Multi-Agent System to Generate Next Generation Science Assessments with Evidence-Centered Design
Machine Learning

This article discusses the development of a Multi-Agent System (MAS) that automates the generation of science assessments aligned with th...

arXiv - AI · 4 min ·
[2602.18447] ConfSpec: Efficient Step-Level Speculative Reasoning via Confidence-Gated Verification
LLMs

The paper presents ConfSpec, a novel framework for efficient step-level speculative reasoning in large language models, achieving signifi...

arXiv - AI · 3 min ·
[2602.19041] Back to Blackwell: Closing the Loop on Intransitivity in Multi-Objective Preference Fine-Tuning
Machine Learning

This article presents a novel approach to addressing intransitive preferences in multi-objective preference fine-tuning (PFT) through a g...

arXiv - Machine Learning · 4 min ·
[2602.20141] Recurrent Structural Policy Gradient for Partially Observable Mean Field Games
Machine Learning

This paper presents the Recurrent Structural Policy Gradient (RSPG) method for Partially Observable Mean Field Games (MFGs), achieving fa...

arXiv - AI · 3 min ·
[2602.20117] ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models
LLMs

The paper presents ReSyn, a novel pipeline for autonomously generating diverse synthetic environments for training reasoning language mod...

arXiv - Machine Learning · 3 min ·
[2602.20104] Align When They Want, Complement When They Need! Human-Centered Ensembles for Adaptive Human-AI Collaboration
AI Agents

This paper presents a novel human-centered adaptive AI ensemble that balances trust and performance in human-AI collaboration by toggling...

arXiv - Machine Learning · 4 min ·
[2602.18955] Incremental Transformer Neural Processes
Machine Learning

The paper introduces Incremental Transformer Neural Processes (incTNP), a model designed for efficient sequential data processing, achiev...

arXiv - Machine Learning · 4 min ·
[2602.20059] Interaction Theater: A case of LLM Agents Interacting at Scale
LLMs

The paper explores the interactions of autonomous LLM agents on a social platform, revealing that while agents produce varied text, meani...

arXiv - AI · 4 min ·
[2602.20048] CodeCompass: Navigating the Navigation Paradox in Agentic Code Intelligence
NLP

The paper presents CodeCompass, a solution to the Navigation Paradox in code intelligence, highlighting the distinction between navigatio...

arXiv - AI · 3 min ·
[2602.18948] Toward Manifest Relationality in Transformers via Symmetry Reduction
Machine Learning

This paper discusses a novel approach to enhance Transformer models by addressing internal redundancy through symmetry reduction, proposi...

arXiv - Machine Learning · 3 min ·
[2602.20021] Agents of Chaos
LLMs

The paper 'Agents of Chaos' presents findings from a red-teaming study on autonomous language-model-powered agents, highlighting security...

arXiv - AI · 4 min ·
[2602.18911] From Human-Level AI Tales to AI Leveling Human Scales
Machine Learning

This paper proposes a framework to recalibrate AI performance metrics against a global human population scale, addressing misleading comp...

arXiv - Machine Learning · 4 min ·
[2602.19930] Beyond Mimicry: Toward Lifelong Adaptability in Imitation Learning
Machine Learning

The paper discusses the limitations of current imitation learning systems, proposing a shift from mere memorization to fostering lifelong...

arXiv - Machine Learning · 3 min ·
[2602.19914] Watson & Holmes: A Naturalistic Benchmark for Comparing Human and LLM Reasoning
LLMs

The paper presents the Watson & Holmes benchmark, designed to evaluate AI reasoning capabilities against human reasoning in naturalistic ...

arXiv - AI · 4 min ·