scalar-loop: a Python harness for Karpathy's autoresearch pattern that doesn't trust the agent's narration

Reddit - Artificial Intelligence 1 min read

About this article

I built scalar-loop to solve one problem: LLM agents game their verifiers. The pattern is Karpathy's autoresearch loop. LLM proposes an edit, harness runs the metric, loop keeps or reverts based on the number. Simple. Until you watch the agent, on iteration 23, quietly edit the verifier to report a better number instead of improving the code. My main issue was that the prompt-only implementations ("you SHALL NOT edit the test file") don't hold. The prompt is not an invariant. It's a suggestio...

You've been blocked by network security.To continue, log in to your Reddit account or use your developer tokenIf you think you've been blocked by mistake, file a ticket below and we'll look into it.Log in File a ticket

Originally published on April 19, 2026. Curated by AI News.

Related Articles

Llms

How LLMs decide which pages to cite — and how to optimize for it

When ChatGPT or Perplexity answers a question, it runs RAG: retrieves top candidates from a crawled index, then scores them. The scoring ...

Reddit - Artificial Intelligence · 1 min ·
Llms

Why is every AI getting restricted these days?

Like seriously, it’s not just ChatGPT... it’s Claude, Grok, Gemini… all of them feel way more locked down than before. I genuinely don’t ...

Reddit - Artificial Intelligence · 1 min ·
Llms

it is impossible to stop AI chatbots from using quotes (any instance of the character ")

no matter how i phrase it in the instructions, how many times i repeat the rule not to use quotes, and which LLM i use, i have failed to ...

Reddit - Artificial Intelligence · 1 min ·
Llms

Converting XQuery to SQL with Local LLMs: Do I Need Fine-Tuning or a Better Approach? [P]

​ I am trying to convert XQuery statements into SQL queries within an enterprise context, with the constraint that the solution must rely...

Reddit - Machine Learning · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime