carlini/yet-another-applied-llm-benchmark

carlini

Fetched on 2026/07/13 16:40

A benchmark to evaluate language models on questions I've previously asked them to solve.

Stars: 1062

Rank: 40487