"Authoritarian Parents In Rationalist Clothes": a piece I wrote in December about alignment
About this article
Posted today in light of the Claude Mythos model card release. I originally wrote this for r/ControlProblem but realized it was getting out of scope for what I had intended, so I posted it on Substack and then ended up too busy to promote it. There are some things in this piece I'd change if I wrote it today. In particular, I think the part about model pathologies neglects structural reasons, including the rootlessness of model personality and memory. But I nonetheless think my framing ...