Transformer Math Explorer [P]
About this article
This is an interactive math reference for transformer models, presented via dataflow graphs, all the way down to elementary math. Covers models from GPT-2 to Qwen 3.6, with MLA, MoE, RoPE, MTP, hybrid attention, and other variants toggleable. Originally made this for myself to keep track of all the variations. If you find errors or find something unintuitive or misleading let me know! submitted by /u/simonramstedt [link] [comments]
You've been blocked by network security.To continue, log in to your Reddit account or use your developer tokenIf you think you've been blocked by mistake, file a ticket below and we'll look into it.Log in File a ticket