Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last w...

TechCrunch - AI · 3 min

Von Hammerstein’s Ghost: What a Prussian General’s Officer Typology Can Teach Us About AI Misalignment

Greetings all - I've posted mostly in r/claudecode and r/aigamedev a couple of times previously. Working with CC for personal projects re...

Reddit - Artificial Intelligence · 1 min

World models will be the next big thing, bye-bye LLMs

Was at Nvidia's GTC conference recently and honestly, it was one of the most eye-opening events I've attended in a while. There was a lot...

Reddit - Artificial Intelligence · 1 min

All Content

[2603.21584] SSAM: Singular Subspace Alignment for Merging Multimodal Large Language Models
arXiv - Machine Learning · 4 min

[2603.21567] Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy
arXiv - Machine Learning · 3 min

[2603.21534] Generalization Limits of In-Context Operator Networks for Higher-Order Partial Differential Equations
arXiv - Machine Learning · 3 min

[2603.21396] Mechanisms of Introspective Awareness
arXiv - Machine Learning · 3 min

[2603.21373] PLR: Plackett-Luce for Reordering In-Context Learning Examples
arXiv - Machine Learning · 3 min

[2603.21365] TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference
arXiv - Machine Learning · 4 min

[2603.21354] The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project
arXiv - Machine Learning · 4 min

[2603.21170] Pruned Adaptation Modules: A Simple yet Strong Baseline for Continual Foundation Models
arXiv - Machine Learning · 4 min

[2603.21105] ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Language Models
arXiv - Machine Learning · 4 min

[2603.21014] CLT-Forge: A Scalable Library for Cross-Layer Transcoders and Attribution Graphs
arXiv - Machine Learning · 3 min

[2603.20969] Understanding Contextual Recall in Transformers: How Finetuning Enables In-Context Reasoning over Pretraining Knowledge
arXiv - Machine Learning · 4 min

[2603.20921] Discriminative Representation Learning for Clinical Prediction
arXiv - Machine Learning · 3 min

[2603.20910] LLM-ODE: Data-driven Discovery of Dynamical Systems with Large Language Models
arXiv - Machine Learning · 3 min

[2603.20825] Cross-Granularity Representations for Biological Sequences: Insights from ESM and BiGCARP
arXiv - Machine Learning · 4 min

[2603.20632] Optimal low-rank stochastic gradient estimation for LLM training
arXiv - Machine Learning · 3 min

[2603.20587] Neural collapse in the orthoplex regime
arXiv - Machine Learning · 3 min

[2603.20572] LJ-Bench: Ontology-Based Benchmark for U.S. Crime
arXiv - Machine Learning · 3 min

[2603.20538] Understanding Behavior Cloning with Action Quantization
arXiv - Machine Learning · 3 min

[2603.20492] AE-LLM: Adaptive Efficiency Optimization for Large Language Models
arXiv - Machine Learning · 4 min

[2603.20405] Putnam 2025 Problems in Rocq using Opus 4.6 and Rocq-MCP
arXiv - Machine Learning · 3 min