AI Agents
Autonomous agents, tool use, and agentic systems
Top This Week
Started a video series on building an orchestration layer for LLM post-training [P]
Hi everyone! Context, motivation, a lot of yapping, feel free to skip to TL;DR. A while back I posted here asking [D] What framework do y...
All Content
[2602.15654] Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections
This paper discusses the security vulnerabilities of self-evolving LLM agents, introducing the concept of 'Zombie Agents' that can be cov...
[2504.05654] Curved representational Bregman divergences and their applications
This article introduces curved representational Bregman divergences, exploring their mathematical foundations and applications in informa...
[2602.15600] The geometry of online conversations and the causal antecedents of conflictual discourse
This article explores the dynamics of conflictual discourse in online conversations, particularly focusing on climate change discussions....
[2602.15564] Beyond Static Pipelines: Learning Dynamic Workflows for Text-to-SQL
The paper presents a novel approach to Text-to-SQL systems by introducing dynamic workflows that adapt during inference, enhancing perfor...
[2602.11618] How Well Do Large-Scale Chemical Language Models Transfer to Downstream Tasks?
This paper evaluates the effectiveness of large-scale Chemical Language Models (CLMs) in transferring knowledge to downstream molecular p...
[2602.10706] Reducing Estimation Uncertainty Using Normalizing Flows and Stratification
This paper presents a novel approach to reducing estimation uncertainty in statistical analysis using normalizing flows and stratified sa...
[2602.08032] Horizon Imagination: Efficient On-Policy Rollout in Diffusion World Models
The paper presents Horizon Imagination (HI), an innovative on-policy imagination process for reinforcement learning using diffusion-based...
[2602.07418] Achieving Optimal Static and Dynamic Regret Simultaneously in Bandits with Deterministic Losses
This paper presents an algorithm that achieves optimal static and dynamic regret simultaneously in adversarial multi-armed bandits with d...
[2602.15513] Improving MLLMs in Embodied Exploration and Question Answering with Human-Inspired Memory Modeling
This paper presents a novel non-parametric memory framework for improving Multimodal Large Language Models (MLLMs) in embodied exploratio...
[2602.05999] On the Role of Iterative Computation in Reinforcement Learning
This paper explores how the amount of compute available to reinforcement learning (RL) policies influences their learning capabilities an...
[2601.19720] Improving Policy Exploitation in Online Reinforcement Learning with Instant Retrospect Action
The paper presents a novel algorithm, Instant Retrospect Action (IRA), aimed at enhancing policy exploitation in online reinforcement lea...
[2601.10498] PROMA: Projected Microbatch Accumulation for Reference-Free Proximal Policy Updates
The paper introduces Projected Microbatch Accumulation (PROMA), a novel method for proximal policy updates that enhances KL divergence co...
[2602.15439] Algorithmic Approaches to Opinion Selection for Online Deliberation: A Comparative Study
This article examines various algorithmic approaches to opinion selection in online deliberation, highlighting the trade-offs between div...
[2602.15397] ActionCodec: What Makes for Good Action Tokenizers
The paper introduces ActionCodec, a novel action tokenizer designed to enhance Vision-Language-Action (VLA) models by optimizing tokeniza...
[2602.15377] Orchestration-Free Customer Service Automation: A Privacy-Preserving and Flowchart-Guided Framework
This paper presents an orchestration-free framework for customer service automation, utilizing Task-Oriented Flowcharts (TOFs) to enhance...
[2602.15362] Automated Multi-Source Debugging and Natural Language Error Explanation for Dashboard Applications
This paper presents a novel system for Automated Multi-Source Debugging and Natural Language Error Explanation, aimed at improving user e...
[2602.15353] NeuroSymActive: Differentiable Neural-Symbolic Reasoning with Active Exploration for Knowledge Graph Question Answering
The paper presents NeuroSymActive, a novel framework for Knowledge Graph Question Answering that integrates differentiable neural-symboli...
[2602.15350] Fine-Tuning LLMs to Generate Economical and Reliable Actions for the Power Grid
This paper discusses a method for fine-tuning large language models (LLMs) to generate effective corrective actions for power grid manage...
[2510.26792] Learning Pseudorandom Numbers with Transformers: Permuted Congruential Generators, Curricula, and Interpretability
This article explores how Transformer models can learn sequences generated by Permuted Congruential Generators (PCGs), demonstrating thei...
[2510.03269] General Exploratory Bonus for Optimistic Exploration in RLHF
This paper introduces the General Exploratory Bonus (GEB) framework, which enhances optimistic exploration in reinforcement learning with...