[2604.08563] Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models

arXiv - AI · April 13, 2026 · 3 min read

About this article

Computer Science > Computation and Language

arXiv:2604.08563 (cs)

[Submitted on 18 Mar 2026]

Title: Temperature-Dependent Performance of Prompting Strategies in Extended Reasoning Large Language Models

Authors: Mousa Salah, Amgad Muneer

Abstract: Extended reasoning models represent a transformative shift in Large Language Model (LLM) capabilities by enabling explicit test-time computation for complex problem solving. However, the optimal configuration of sampling temperature and prompting strategy for these systems remains largely underexplored. We systematically evaluate chain-of-thought and zero-shot prompting across four temperature settings (0.0, 0.4, 0.7, and 1.0) using Grok-4.1 with extended reasoning on 39 mathematical problems from AMO-Bench, a challenging International Mathematical Olympiad-level benchmark. We find that zero-shot prompting achieves peak performance at moderate temperatures, reaching 59% accuracy at T=0.4 and T=0.7, while chain-of-thought prompting performs best at the temperature extremes. Most notably, the benefit of extended reasoning increases from 6x at T=0.0 to 14.3x at T=1.0. These results suggest that temperature should be optimized jointly with prompting strategy, challenging the common practice of using T=0 for reasoning tasks.
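The evaluation design described in the abstract (two prompting strategies crossed with four temperatures, scored on a fixed problem set) can be sketched roughly as follows. This is an illustrative outline only, not the authors' code: the prompt templates, the exact-match scoring rule, and the solve callback are all hypothetical, and the abstract does not say how answers were graded or how many samples were drawn per problem.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# The four temperature settings and two strategies named in the abstract.
TEMPERATURES = [0.0, 0.4, 0.7, 1.0]
STRATEGIES = ["zero_shot", "chain_of_thought"]

# Hypothetical templates; the paper's actual prompt wording is not given.
PROMPTS = {
    "zero_shot": "{problem}",
    "chain_of_thought": "{problem}\n\nLet's think step by step.",
}


def accuracy(correct: int, total: int) -> float:
    """Fraction of problems answered correctly."""
    return correct / total


def run_grid(
    problems: List[Tuple[str, str]],
    solve: Callable[[str, float], str],
) -> Dict[Tuple[str, float], float]:
    """Score every (strategy, temperature) cell on the same problem set.

    problems: (question, reference_answer) pairs.
    solve:    hypothetical callback (prompt, temperature) -> answer string;
              in a real run it would wrap a model API call.
    """
    results = {}
    for strategy, temperature in product(STRATEGIES, TEMPERATURES):
        correct = 0
        for question, reference in problems:
            prompt = PROMPTS[strategy].format(problem=question)
            # Assumed scoring rule: exact string match after trimming.
            if solve(prompt, temperature).strip() == reference.strip():
                correct += 1
        results[(strategy, temperature)] = accuracy(correct, len(problems))
    return results


if __name__ == "__main__":
    # Stub solver standing in for a model; it always answers "4".
    def fake_solve(prompt: str, temperature: float) -> str:
        return "4"

    demo = [("What is 2 + 2?", "4"), ("What is 3 + 3?", "6")]
    for cell, acc in run_grid(demo, fake_solve).items():
        print(cell, acc)
```

With only 39 problems, each problem accounts for 1/39, or about 2.6 percentage points, so small accuracy differences between cells correspond to one or two problems. If each problem is scored once per cell (an assumption, since the abstract does not state it), the reported 59% is consistent with 23 of 39 problems solved.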

Originally published on April 13, 2026. Curated by AI News.

