Large Language Models

GPT, Claude, Gemini, and other LLMs


Top This Week

Llms

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min · about 8 hours ago

Llms

built an open source CLI that auto generates AI setup files for your projects just hit 150 stars

hey everyone, been working on this side project called ai-setup and just hit a milestone i wanted to share 150 github stars, 90 PRs merge...

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

Llms

built an open source tool that auto generates AI context files for any codebase, 150 stars in

one of the most tedious parts of working with AI coding tools is having to manually write context files every single time. CLAUDE.md, .cu...

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

All Content

Llms

[2603.23640] LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustained Load

Abstract page for arXiv paper 2603.23640: LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustain...

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.23611] LLMORPH: Automated Metamorphic Testing of Large Language Models

Abstract page for arXiv paper 2603.23611: LLMORPH: Automated Metamorphic Testing of Large Language Models

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.23576] Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM

Abstract page for arXiv paper 2603.23576: Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.23539] PLDR-LLMs Reason At Self-Organized Criticality

Abstract page for arXiv paper 2603.23539: PLDR-LLMs Reason At Self-Organized Criticality

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.23533] MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG

Abstract page for arXiv paper 2603.23533: MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High...

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.23530] Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

Abstract page for arXiv paper 2603.23530: Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.23514] DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

Abstract page for arXiv paper 2603.23514: DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Langu...

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.23507] Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

Abstract page for arXiv paper 2603.23507: Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24562] Scaling Recurrence-aware Foundation Models for Clinical Records via Next-Visit Prediction

Abstract page for arXiv paper 2603.24562: Scaling Recurrence-aware Foundation Models for Clinical Records via Next-Visit Prediction

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24533] UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Abstract page for arXiv paper 2603.24533: UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24518] TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models

Abstract page for arXiv paper 2603.24518: TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24517] AVO: Agentic Variation Operators for Autonomous Evolutionary Search

Abstract page for arXiv paper 2603.24517: AVO: Agentic Variation Operators for Autonomous Evolutionary Search

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24511] Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

Abstract page for arXiv paper 2603.24511: Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24382] MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization

Abstract page for arXiv paper 2603.24382: MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.24324] Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning

Abstract page for arXiv paper 2603.24324: Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforce...

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24275] Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Semantic Centers

Abstract page for arXiv paper 2603.24275: Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Sem...

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.24202] A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

Abstract page for arXiv paper 2603.24202: A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24126] Likelihood hacking in probabilistic program synthesis

Abstract page for arXiv paper 2603.24126: Likelihood hacking in probabilistic program synthesis

arXiv - Machine Learning · 3 min · 3 days ago

Llms

[2603.24124] The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation

Abstract page for arXiv paper 2603.24124: The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty...

arXiv - Machine Learning · 4 min · 3 days ago

Llms

[2603.24093] Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization

Abstract page for arXiv paper 2603.24093: Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization

arXiv - Machine Learning · 4 min · 3 days ago

