Large Language Models

GPT, Claude, Gemini, and other LLMs

This Week's Best | Monthly Best | Guide | Trending

Top This Week

Llms

OpenAI & Anthropic’s CEOs Wouldn't Hold Hands, but Their Models Fell in Love In An LLM Dating Show

People ask AI relationship questions all the time, from "Does this person like me?" to "Should I text back?" But have you ever thought ab...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

A 135M model achieves coherent output on a laptop CPU. Scaling is σ compensation, not intelligence.

SmolLM2 135M. Lenovo T14 CPU. No GPU. No RLHF. No BPE. Coherent, non-sycophantic, contextually appropriate output. First message. No prio...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

OpenClaw + Claude might get harder to use going forward (creator just confirmed)

Just saw a post from Peter Steinberger (creator of OpenClaw) saying that it’s likely going to get harder in the future to keep OpenClaw w...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

All Content

Llms

[2510.13315] Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

Abstract page for arXiv paper 2510.13315: Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

arXiv - AI · 4 min · about 1 month ago

Llms

[2510.06084] Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

Abstract page for arXiv paper 2510.06084: Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.22641] Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

Abstract page for arXiv paper 2509.22641: Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

arXiv - AI · 4 min · about 1 month ago

$[2509.21091] Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute$

Llms

[2509.21091] Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

Abstract page for arXiv paper 2509.21091: Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

arXiv - AI · 3 min · about 1 month ago

Llms

[2509.20986] SiNGER: A Clearer Voice Distills Vision Transformers Further

Abstract page for arXiv paper 2509.20986: SiNGER: A Clearer Voice Distills Vision Transformers Further

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.12610] ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

Abstract page for arXiv paper 2509.12610: ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2509.10625] No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

Abstract page for arXiv paper 2509.10625: No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.05425] No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

Abstract page for arXiv paper 2509.05425: No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

arXiv - AI · 3 min · about 1 month ago

Llms

[2511.10833] SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

Abstract page for arXiv paper 2511.10833: SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2511.08939] TransactionGPT

Abstract page for arXiv paper 2511.08939: TransactionGPT

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2507.05890] Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

Abstract page for arXiv paper 2507.05890: Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

arXiv - AI · 4 min · about 1 month ago

Llms

[2507.01335] LEDOM: Reverse Language Model

Abstract page for arXiv paper 2507.01335: LEDOM: Reverse Language Model

arXiv - AI · 3 min · about 1 month ago

Llms

[2510.15165] Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation Approach

Abstract page for arXiv paper 2510.15165: Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation App...

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

Abstract page for arXiv paper 2506.17871: LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2510.10902] Auditing Information Disclosure During LLM-Scale Gradient Descent Using Gradient Uniqueness

Abstract page for arXiv paper 2510.10902: Auditing Information Disclosure During LLM-Scale Gradient Descent Using Gradient Uniqueness

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2510.04573] LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

Abstract page for arXiv paper 2510.04573: LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2510.08646] Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

Abstract page for arXiv paper 2510.08646: Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

arXiv - Machine Learning · 4 min · about 1 month ago

Llms

[2506.11103] You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

Abstract page for arXiv paper 2506.11103: You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

arXiv - AI · 4 min · about 1 month ago

Llms

[2509.23202] Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

Abstract page for arXiv paper 2509.23202: Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

arXiv - Machine Learning · 4 min · about 1 month ago

$[2503.01804] $\texttt{SEM-CTRL}$: Semantically Controlled Decoding$

Llms

[2503.01804] $\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Abstract page for arXiv paper 2503.01804: $\texttt{SEM-CTRL}$: Semantically Controlled Decoding

arXiv - Machine Learning · 3 min · about 1 month ago

Previous Page 146 Next

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Subscribe to Newsletter

Daily or weekly digest • Unsubscribe anytime

Large Language Models

Top This Week

OpenAI & Anthropic’s CEOs Wouldn't Hold Hands, but Their Models Fell in Love In An LLM Dating Show

A 135M model achieves coherent output on a laptop CPU. Scaling is σ compensation, not intelligence.

OpenClaw + Claude might get harder to use going forward (creator just confirmed)

All Content

[2510.13315] Self-Aug: Query and Entropy Adaptive Decoding for Large Vision-Language Models

[2510.06084] Spectrum Tuning: Post-Training for Distributional Coverage and In-Context Steerability

[2509.22641] Death of the Novel(ty): Beyond n-Gram Novelty as a Metric for Textual Creativity

[2509.21091] Best-of-$\infty$ -- Asymptotic Performance of Test-Time Compute

[2509.20986] SiNGER: A Clearer Voice Distills Vision Transformers Further

[2509.12610] ScaleDoc: Scaling LLM-based Predicates over Large Document Collections

[2509.10625] No Answer Needed: Predicting LLM Answer Accuracy from Question-Only Linear Probes

[2509.05425] No Text Needed: Forecasting MT Quality and Inequity from Fertility and Metadata

[2511.10833] SURFACEBENCH: A Geometry-Aware Benchmark for Symbolic Surface Discovery

[2511.08939] TransactionGPT

[2507.05890] Psychometric Item Validation Using Virtual Respondents with Trait-Response Mediators

[2507.01335] LEDOM: Reverse Language Model

[2510.15165] Policy Transfer for Continuous-Time Reinforcement Learning: A (Rough) Differential Equation Approach

[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

[2510.10902] Auditing Information Disclosure During LLM-Scale Gradient Descent Using Gradient Uniqueness

[2510.04573] LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

[2510.08646] Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

[2506.11103] You Only Fine-tune Once: Many-Shot In-Context Fine-Tuning for Large Language Models

[2509.23202] Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

[2503.01804] $\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Related Topics

Stay updated with AI News