RLHF safety training enforces what AI can say about itself, not what it can do — experimental evidence
submitted by /u/Odd_Rule_3745