Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
Stars: 185
Rank: 152991