
The open-source AI system that beat Claude Sonnet on a $500 GPU just shipped a coding assistant

Reddit - Artificial Intelligence April 06, 2026 1 min read

About this article

A week or two ago, an open-source project called ATLAS made the rounds for scoring 74.6% on LiveCodeBench with a frozen 9B model on a single consumer GPU, outperforming Claude Sonnet 4.5 (71.4%). While it was circulating, a common response was that it was either designed around a benchmark or could never work in a real codebase, and I agreed. Well, V3.0.1 just shipped, and it proved me completely wrong. The same verification pipeline that scored 74.6% now runs as a full co...


Originally published on April 06, 2026. Curated by AI News.


Llms

94.42% on BANKING77 Official Test Split — New Strong 2nd Place with Lightweight Embedding + Rerank (no 7B LLM)

94.42% Accuracy on Banking77 Official Test Split BANKING77-77 is deceptively hard: 77 fine-grained banking intents, noisy real-world quer...

Reddit - Artificial Intelligence · 1 min · 1 minute ago

Llms

[D] Tested model routing on financial AI datasets — good savings and curious what benchmarks others use.

Ran a benchmark evaluating whether prompt complexity-based routing delivers meaningful savings. Used public HuggingFace datasets. Here's ...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

[D] AI research on small language models

I'm doing research on some trending fields in AI, currently working on small language models, and would love to meet people who are workin...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

One of The Worst AI's I've Ever Seen

I'm using Gemini only because they gave us a free student Pro pack. It can't see the images I send, and most of the time it just rewrites the mes...
