Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

Have Companies Begun Adopting Claude Co-Work at an Enterprise Level?

Hi Guys, My company is considering purchasing the Claude Enterprise plan. The main two constraints are: - Being able to block usage of Cl...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

Llms

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

Llms

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min · about 5 hours ago

All Content

Llms

[2603.20218] An experimental study of KV cache reuse strategies in chunk-level caching systems

Abstract page for arXiv paper 2603.20218: An experimental study of KV cache reuse strategies in chunk-level caching systems

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.20215] Multi-Agent Debate with Memory Masking

Abstract page for arXiv paper 2603.20215: Multi-Agent Debate with Memory Masking

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.20212] Fast-Slow Thinking RM: Efficient Integration of Scalar and Generative Reward Models

Abstract page for arXiv paper 2603.20212: Fast-Slow Thinking RM: Efficient Integration of Scalar and Generative Reward Models

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.20217] Expected Reward Prediction, with Applications to Model Routing

Abstract page for arXiv paper 2603.20217: Expected Reward Prediction, with Applications to Model Routing

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.22206] Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Abstract page for arXiv paper 2603.22206: Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.22184] Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Abstract page for arXiv paper 2603.22184: Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.22161] Causal Evidence that Language Models use Confidence to Drive Behavior

Abstract page for arXiv paper 2603.22161: Causal Evidence that Language Models use Confidence to Drive Behavior

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.22154] dynActivation: A Trainable Activation Family for Adaptive Nonlinearity

Abstract page for arXiv paper 2603.22154: dynActivation: A Trainable Activation Family for Adaptive Nonlinearity

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.22017] AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing

Abstract page for arXiv paper 2603.22017: AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.21972] Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe

Abstract page for arXiv paper 2603.21972: Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21862] Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization

Abstract page for arXiv paper 2603.21862: Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21705] Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs

Abstract page for arXiv paper 2603.21705: Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21584] SSAM: Singular Subspace Alignment for Merging Multimodal Large Language Models

Abstract page for arXiv paper 2603.21584: SSAM: Singular Subspace Alignment for Merging Multimodal Large Language Models

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21567] Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy

Abstract page for arXiv paper 2603.21567: Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.21534] Generalization Limits of In-Context Operator Networks for Higher-Order Partial Differential Equations

Abstract page for arXiv paper 2603.21534: Generalization Limits of In-Context Operator Networks for Higher-Order Partial Differential Equ...

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.21396] Mechanisms of Introspective Awareness

Abstract page for arXiv paper 2603.21396: Mechanisms of Introspective Awareness

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.21373] PLR: Plackett-Luce for Reordering In-Context Learning Examples

Abstract page for arXiv paper 2603.21373: PLR: Plackett-Luce for Reordering In-Context Learning Examples

arXiv - Machine Learning · 3 min · 8 days ago

Llms

[2603.21365] TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference

Abstract page for arXiv paper 2603.21365: TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21354] The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project

Abstract page for arXiv paper 2603.21354: The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the v...

arXiv - Machine Learning · 4 min · 8 days ago

Llms

[2603.21170] Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models

Abstract page for arXiv paper 2603.21170: Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models

arXiv - Machine Learning · 4 min · 8 days ago

