Implementation of Token Shift GPT, an autoregressive model that relies solely on shifting the sequence space for mixing
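To make "shifting the sequence space for mixing" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the repository's code. It assumes a one-step shift applied to a fraction of each token's features, with zeros filling the first position. The names `token_shift` and `shift_fraction` are hypothetical.

```python
def token_shift(x, shift_fraction=0.5):
    """Replace the leading features of each token with those of the previous token.

    x: sequence of tokens, each a list of floats (seq_len x dim).
    shift_fraction: hypothetical knob, the share of features taken from
    the previous position (an assumption for this sketch).
    """
    if not x:
        return []
    dim = len(x[0])
    n_shift = int(dim * shift_fraction)
    out = []
    for t, token in enumerate(x):
        # Features borrowed from position t - 1; zeros at the first position.
        prev = x[t - 1][:n_shift] if t > 0 else [0.0] * n_shift
        # The remaining features stay at the current position.
        out.append(prev + token[n_shift:])
    return out


seq = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
shifted = token_shift(seq)
for row in shifted:
    print(row)
```

Each output position depends only on the current and the previous token, so the operation is causal and can be used in an autoregressive model. Stacking such shifts between per-token feedforward layers would let information travel one position further per layer.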