[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request
About this article
I'm looking to work with people interested in math, machine learning, or agentic coding, on creating a multi-agent framework to do frontier math research. My core idea is to build the transformer architecture out of Claudes. The Architecture (Q/K/V Attention) concretely, the setup consists of a single claude controlling a team of claudes through the agent teams feature. The team leader is the “attention head” claude, and the team members are the mlp claudes. ClaudeFormer maps precisely to the...
You've been blocked by network security.To continue, log in to your Reddit account or use your developer tokenIf you think you've been blocked by mistake, file a ticket below and we'll look into it.Log in File a ticket