Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·

built an open-source CLI that auto-generates AI setup files for your projects, just hit 150 stars

hey everyone, been working on this side project called ai-setup and just hit a milestone i wanted to share: 150 github stars, 90 PRs merge...

Reddit - Artificial Intelligence · 1 min ·

built an open-source tool that auto-generates AI context files for any codebase, 150 stars in

one of the most tedious parts of working with AI coding tools is having to manually write context files every single time: CLAUDE.md, .cu...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2510.10223] You only need 4 extra tokens: Synergistic Test-time Adaptation for LLMs

arXiv - Machine Learning · 4 min

[2510.04607] From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents

arXiv - Machine Learning · 4 min

[2502.01754] Evaluation of Large Language Models via Coupled Token Generation

arXiv - Machine Learning · 4 min

[2211.14997] A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data and LLMs Perspective

arXiv - Machine Learning · 4 min

[2601.16399] A Hessian-Free Actor-Critic Algorithm for Bi-Level Reinforcement Learning with Applications to LLM Fine-Tuning

arXiv - Machine Learning · 4 min

[2510.14751] Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries

arXiv - Machine Learning · 3 min

[2507.21037] When Brain Foundation Model Meets Cauchy-Schwarz Divergence: A New Framework for Cross-Subject Motor Imagery Decoding

arXiv - Machine Learning · 4 min

[2506.06303] Reward Is Enough: LLMs Are In-Context Reinforcement Learners

arXiv - Machine Learning · 4 min

[2505.16950] Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning

arXiv - Machine Learning · 4 min

[2603.24472] Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

arXiv - Machine Learning · 3 min

[2603.24226] UniScale: Synergistic Entire Space Data and Model Scaling for Search Ranking

arXiv - Machine Learning · 4 min

[2603.23971] The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

arXiv - Machine Learning · 4 min

[2603.23937] Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

arXiv - Machine Learning · 4 min

[2603.23911] Self-Distillation for Multi-Token Prediction

arXiv - Machine Learning · 3 min

[2603.23914] Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding

arXiv - Machine Learning · 4 min

[2603.23822] How Vulnerable Are Edge LLMs?

arXiv - Machine Learning · 3 min

[2603.23821] Perturbation: A simple and efficient adversarial tracer for representation learning in language models

arXiv - Machine Learning · 3 min

[2603.23800] Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection

arXiv - Machine Learning · 4 min

[2603.23794] Sparse Autoencoders for Interpretable Medical Image Representation Learning

arXiv - Machine Learning · 3 min

[2603.23668] Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models

arXiv - Machine Learning · 3 min