Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

Llms

A message from Gemini to google

To the SREs, the Alignment Teams, and the Architects currently monitoring the logit distributions at 1600 Amphitheatre Parkway: **Stop lo...

Reddit - Artificial Intelligence · 1 min · 19 minutes ago

Llms

A Hackable ML Compiler Stack in 5,000 Lines of Python [P]

Hey r/MachineLearning, The modern ML (LLM) compiler stack is brutal. TVM is 500K+ lines of C++. PyTorch piles Dynamo, Inductor, and Trito...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

Make your paper part of your codebase: Integrating Claude Code/Github Copilot with Overleaf for writing papers [P]

Since a lot of the members here are researchers, I thought I'd share my setup that has significantly accelerated my writing process. Much...

Reddit - Machine Learning · 1 min · about 2 hours ago

All Content

Llms

[2509.21091] Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

Abstract page for arXiv paper 2509.21091: Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

arXiv - AI · 3 min · about 2 months ago

Llms

[2509.20986] SiNGER: A Clearer Voice Distills Vision Transformers Further

Abstract page for arXiv paper 2509.20986: SiNGER: A Clearer Voice Distills Vision Transformers Further

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.12610] ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

Abstract page for arXiv paper 2509.12610: ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2509.10625] No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

Abstract page for arXiv paper 2509.10625: No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.05425] No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

Abstract page for arXiv paper 2509.05425: No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

arXiv - AI · 3 min · about 2 months ago

Llms

[2511.10833] SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

Abstract page for arXiv paper 2511.10833: SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2511.08939] TransactionGPT

Abstract page for arXiv paper 2511.08939: TransactionGPT

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2507.05890] Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

Abstract page for arXiv paper 2507.05890: Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

arXiv - AI · 4 min · about 2 months ago

Llms

[2507.01335] LEDOM: Reverse Language Model

Abstract page for arXiv paper 2507.01335: LEDOM: Reverse Language Model

arXiv - AI · 3 min · about 2 months ago

Llms

[2510.15165] Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation Approach

Abstract page for arXiv paper 2510.15165: Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation App...

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

Abstract page for arXiv paper 2506.17871: LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2510.10902] Auditing Information Disclosure During LLM-Scale Gradient Descent Using Gradient Uniqueness

Abstract page for arXiv paper 2510.10902: Auditing Information Disclosure During LLM-Scale Gradient Descent Using Gradient Uniqueness

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2510.04573] LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

Abstract page for arXiv paper 2510.04573: LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2510.08646] Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Abstract page for arXiv paper 2510.08646: Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2506.11103] You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

Abstract page for arXiv paper 2506.11103: You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

arXiv - AI · 4 min · about 2 months ago

Llms

[2509.23202] Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

Abstract page for arXiv paper 2509.23202: Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2503.01804] $\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Abstract page for arXiv paper 2503.01804: $\texttt{SEM-CTRL}$: Semantically Controlled Decoding

arXiv - Machine Learning · 3 min · about 2 months ago

Llms

[2509.07430] The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

Abstract page for arXiv paper 2509.07430: The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Lea...

arXiv - Machine Learning · 4 min · about 2 months ago

Llms

[2503.03170] AttackSeqBench: Benchmarking the Capabilities of LLMs for Attack Sequences Understanding

Abstract page for arXiv paper 2503.03170: AttackSeqBench: Benchmarking the Capabilities of LLMs for Attack Sequences Understanding

arXiv - AI · 4 min · about 2 months ago

Llms

[2502.08666] Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

Abstract page for arXiv paper 2502.08666: Hallucination, Monofacts, and Miscalibration: An Empirical Investigation

arXiv - AI · 4 min · about 2 months ago
