Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Claude on Claude

The Story of Anthropic’s Latest Controversies Regarding the Business of Its Prized Creation… As Told by the Thing Itself. Editor’s note: ...

Reddit - Artificial Intelligence · 1 min

Cut Claude usage by ~85% in a job search pipeline (16k → 900 tokens/app) — here’s what worked

Like many here, I kept running into Claude usage limits when building anything non-trivial. I was working with a job search automation pi...

Reddit - Artificial Intelligence · 1 min

"Authoritarian Parents In Rationalist Clothes": a piece I wrote in December about alignment

Posted today in light of the Claude Mythos model card release. Originally I wrote this for r/ControlProblem but realized it was getting o...

Reddit - Artificial Intelligence · 1 min

All Content

[2603.04444] vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

Abstract page for arXiv paper 2603.04444: vLLM Semantic Router: Signal Driven Decision Routing for Mixture-of-Modality Models

arXiv - AI · 4 min
[2603.04436] ZorBA: Zeroth-order Federated Fine-tuning of LLMs with Heterogeneous Block Activation

Abstract page for arXiv paper 2603.04436: ZorBA: Zeroth-order Federated Fine-tuning of LLMs with Heterogeneous Block Activation

arXiv - Machine Learning · 4 min
[2603.04443] AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Abstract page for arXiv paper 2603.04443: AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

arXiv - Machine Learning · 4 min
[2603.04429] What Is Missing: Interpretable Ratings for Large Language Model Outputs

Abstract page for arXiv paper 2603.04429: What Is Missing: Interpretable Ratings for Large Language Model Outputs

arXiv - AI · 4 min
[2603.04428] Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Devices

Abstract page for arXiv paper 2603.04428: Agent Memory Below the Prompt: Persistent Q4 KV Cache for Multi-Agent LLM Inference on Edge Dev...

arXiv - Machine Learning · 4 min
[2603.04421] Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

Abstract page for arXiv paper 2603.04421: Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

arXiv - AI · 3 min
[2603.04419] Context-Dependent Affordance Computation in Vision-Language Models

Abstract page for arXiv paper 2603.04419: Context-Dependent Affordance Computation in Vision-Language Models

arXiv - Machine Learning · 4 min
[2603.04413] Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Meaning in LLM Text Summaries

Abstract page for arXiv paper 2603.04413: Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Me...

arXiv - AI · 4 min
[2603.04411] One Size Does Not Fit All: Token-Wise Adaptive Compression for KV Cache

Abstract page for arXiv paper 2603.04411: One Size Does Not Fit All: Token-Wise Adaptive Compression for KV Cache

arXiv - Machine Learning · 3 min
[2603.04410] SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

Abstract page for arXiv paper 2603.04410: SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv - AI · 4 min
[2603.04409] Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

Abstract page for arXiv paper 2603.04409: Unpacking Human Preference for LLMs: Demographically Aware Evaluation with the HUMAINE Framework

arXiv - AI · 4 min
[2603.04406] CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models

Abstract page for arXiv paper 2603.04406: CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG M...

arXiv - AI · 4 min
[2603.04407] Semantic Containment as a Fundamental Property of Emergent Misalignment

Abstract page for arXiv paper 2603.04407: Semantic Containment as a Fundamental Property of Emergent Misalignment

arXiv - AI · 3 min
[2603.04405] Lost in Translation: How Language Re-Aligns Vision for Cross-Species Pathology

Abstract page for arXiv paper 2603.04405: Lost in Translation: How Language Re-Aligns Vision for Cross-Species Pathology

arXiv - Machine Learning · 4 min
[2603.05498] The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

Abstract page for arXiv paper 2603.05498: The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

arXiv - AI · 3 min
[2603.05485] Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

Abstract page for arXiv paper 2603.05485: Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

arXiv - AI · 3 min
[2603.05399] Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

Abstract page for arXiv paper 2603.05399: Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

arXiv - AI · 3 min
[2603.05392] Legal interpretation and AI: from expert systems to argumentation and LLMs

Abstract page for arXiv paper 2603.05392: Legal interpretation and AI: from expert systems to argumentation and LLMs

arXiv - AI · 3 min
[2603.05294] STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

Abstract page for arXiv paper 2603.05294: STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

arXiv - AI · 3 min
[2603.05290] X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes

Abstract page for arXiv paper 2603.05290: X-RAY: Mapping LLM Reasoning Capability via Formalized and Calibrated Probes

arXiv - AI · 4 min