Early proof of concept. Operationalizing trust and safety policies as evals at scale, using model-written examples.
Stars: 1
Rank: 4944926