Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

LLMs

8 free AI courses from Anthropic’s Claude platform with certificates

AI News - General · 19 minutes ago

LLMs

Anthropic launches Claude Managed Agents — composable APIs for shipping production AI agents 10x faster. Notion, Rakuten, Asana, and Sentry already in production.

Anthropic launches Claude Managed Agents in public beta — composable APIs for shipping production AI agents 10x faster Handles sandboxing...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

LLMs

6 Months Using AI for Actual Work: What's Incredible, What's Overhyped, and What's Quietly Dangerous

Six months ago I committed to using AI tools for everything I possibly could in my work. Every day, every task, every workflow. Here's th...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

All Content

LLMs

[2603.02675] From Shallow to Deep: Pinning Semantic Intent via Causal GRPO

Abstract page for arXiv paper 2603.02675: From Shallow to Deep: Pinning Semantic Intent via Causal GRPO

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2504.21023] Param$Δ$ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Abstract page for arXiv paper 2504.21023: Param$Δ$ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03258] Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals

Abstract page for arXiv paper 2603.03258: Inherited Goal Drift: Contextual Pressure Can Undermine Agentic Goals

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.02635] SaFeR-ToolKit: Structured Reasoning via Virtual Tool Calling for Multimodal Safety

Abstract page for arXiv paper 2603.02635: SaFeR-ToolKit: Structured Reasoning via Virtual Tool Calling for Multimodal Safety

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.03242] Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

Abstract page for arXiv paper 2603.03242: Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.02630] MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks

Abstract page for arXiv paper 2603.02630: MASPOB: Bandit-Based Prompt Optimization for Multi-Agent Systems with Graph Neural Networks

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03233] AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework

Abstract page for arXiv paper 2603.03233: AI-for-Science Low-code Platform with Bayesian Adversarial Multi-Agent Framework

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03203] No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

Abstract page for arXiv paper 2603.03203: No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Langu...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.02604] Heterogeneous Agent Collaborative Reinforcement Learning

Abstract page for arXiv paper 2603.02604: Heterogeneous Agent Collaborative Reinforcement Learning

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2603.03175] Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

Abstract page for arXiv paper 2603.03175: Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03147] Agentic AI-based Coverage Closure for Formal Verification

Abstract page for arXiv paper 2603.03147: Agentic AI-based Coverage Closure for Formal Verification

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.03080] Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Recommendation

Abstract page for arXiv paper 2603.03080: Beyond Factual Correctness: Mitigating Preference-Inconsistent Explanations in Explainable Reco...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2603.03116] Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation

Abstract page for arXiv paper 2603.03116: Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.02510] ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

Abstract page for arXiv paper 2603.02510: ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evol...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.03002] SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

Abstract page for arXiv paper 2603.03002: SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.02482] MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

Abstract page for arXiv paper 2603.02482: MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2603.03078] RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

Abstract page for arXiv paper 2603.03078: RAPO: Expanding Exploration for LLM Agents via Retrieval-Augmented Policy Optimization

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03072] TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

Abstract page for arXiv paper 2603.03072: TikZilla: Scaling Text-to-TikZ with High-Quality Data and Reinforcement Learning

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03018] REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise Telemetry

Abstract page for arXiv paper 2603.03018: REGAL: A Registry-Driven Architecture for Deterministic Grounding of Agentic AI in Enterprise T...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2603.03005] OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Structured Agents

Abstract page for arXiv paper 2603.03005: OrchMAS: Orchestrated Reasoning with Multi Collaborative Heterogeneous Scientific Expert Struct...

arXiv - AI · 4 min · about 1 month ago

