Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·

built an open-source CLI that auto-generates AI setup files for your projects, just hit 150 stars

hey everyone, been working on this side project called ai-setup and just hit a milestone i wanted to share: 150 github stars, 90 PRs merge...

Reddit - Artificial Intelligence · 1 min ·

built an open-source tool that auto-generates AI context files for any codebase, 150 stars in

one of the most tedious parts of working with AI coding tools is having to manually write context files every single time: CLAUDE.md, .cu...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2510.10223] You only need 4 extra tokens: Synergistic Test-time Adaptation for LLMs

arXiv - Machine Learning · 4 min

[2510.04607] From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents

arXiv - Machine Learning · 4 min

[2502.01754] Evaluation of Large Language Models via Coupled Token Generation

arXiv - Machine Learning · 4 min

[2211.14997] A Comprehensive Survey on Enterprise Financial Risk Analysis from Big Data and LLMs Perspective

arXiv - Machine Learning · 4 min

[2601.16399] A Hessian-Free Actor-Critic Algorithm for Bi-Level Reinforcement Learning with Applications to LLM Fine-Tuning

arXiv - Machine Learning · 4 min

[2510.14751] Beyond Multi-Token Prediction: Pretraining LLMs with Future Summaries

arXiv - Machine Learning · 3 min

[2507.21037] When Brain Foundation Model Meets Cauchy-Schwarz Divergence: A New Framework for Cross-Subject Motor Imagery Decoding

arXiv - Machine Learning · 4 min

[2506.06303] Reward Is Enough: LLMs Are In-Context Reinforcement Learners

arXiv - Machine Learning · 4 min

[2505.16950] Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning

arXiv - Machine Learning · 4 min

[2603.24472] Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

arXiv - Machine Learning · 3 min

[2603.24226] UniScale: Synergistic Entire Space Data and Model Scaling for Search Ranking

arXiv - Machine Learning · 4 min

[2603.23971] The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

arXiv - Machine Learning · 4 min

[2603.23937] Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development

arXiv - Machine Learning · 4 min

[2603.23911] Self-Distillation for Multi-Token Prediction

arXiv - Machine Learning · 3 min

[2603.23914] Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding

arXiv - Machine Learning · 4 min

[2603.23822] How Vulnerable Are Edge LLMs?

arXiv - Machine Learning · 3 min

[2603.23821] Perturbation: A simple and efficient adversarial tracer for representation learning in language models

arXiv - Machine Learning · 3 min

[2603.23800] Object Search in Partially-Known Environments via LLM-informed Model-based Planning and Prompt Selection

arXiv - Machine Learning · 4 min

[2603.23794] Sparse Autoencoders for Interpretable Medical Image Representation Learning

arXiv - Machine Learning · 3 min

[2603.23668] Energy Efficient Software Hardware CoDesign for Machine Learning: From TinyML to Large Language Models

arXiv - Machine Learning · 3 min