Large Language Models

GPT, Claude, Gemini, and other LLMs

Top This Week

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·

built an open source CLI that auto generates AI setup files for your projects just hit 150 stars

hey everyone, been working on this side project called ai-setup and just hit a milestone i wanted to share 150 github stars, 90 PRs merge...

Reddit - Artificial Intelligence · 1 min ·

built an open source tool that auto generates AI context files for any codebase, 150 stars in

one of the most tedious parts of working with AI coding tools is having to manually write context files every single time. CLAUDE.md, .cu...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2603.23640] LLM Inference at the Edge: Mobile, NPU, and GPU Performance Efficiency Trade-offs Under Sustained Load

arXiv - Machine Learning · 4 min ·
[2603.23611] LLMORPH: Automated Metamorphic Testing of Large Language Models

arXiv - Machine Learning · 4 min ·
[2603.23576] Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM

arXiv - Machine Learning · 3 min ·
[2603.23539] PLDR-LLMs Reason At Self-Organized Criticality

arXiv - Machine Learning · 3 min ·
[2603.23533] MDKeyChunker: Single-Call LLM Enrichment with Rolling Keys and Key-Based Restructuring for High-Accuracy RAG

arXiv - Machine Learning · 4 min ·
[2603.23530] Did You Forget What I Asked? Prospective Memory Failures in Large Language Models

arXiv - Machine Learning · 3 min ·
[2603.23514] DepthCharge: A Domain-Agnostic Framework for Measuring Depth-Dependent Knowledge in Large Language Models

arXiv - Machine Learning · 4 min ·
[2603.23507] Beyond Masks: Efficient, Flexible Diffusion Language Models via Deletion-Insertion Processes

arXiv - Machine Learning · 4 min ·
[2603.24562] Scaling Recurrence-aware Foundation Models for Clinical Records via Next-Visit Prediction

arXiv - Machine Learning · 4 min ·
[2603.24533] UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

arXiv - Machine Learning · 4 min ·
[2603.24518] TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models

arXiv - Machine Learning · 4 min ·
[2603.24517] AVO: Agentic Variation Operators for Autonomous Evolutionary Search

arXiv - Machine Learning · 4 min ·
[2603.24511] Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs

arXiv - Machine Learning · 4 min ·
[2603.24382] MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization

arXiv - Machine Learning · 3 min ·
[2603.24324] Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning

arXiv - Machine Learning · 4 min ·
[2603.24275] Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Semantic Centers

arXiv - Machine Learning · 3 min ·
[2603.24202] A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula

arXiv - Machine Learning · 4 min ·
[2603.24126] Likelihood hacking in probabilistic program synthesis

arXiv - Machine Learning · 3 min ·
[2603.24124] The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation

arXiv - Machine Learning · 4 min ·
[2603.24093] Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization

arXiv - Machine Learning · 4 min ·