Implementation of Token Shift GPT - An autoregressive model that relies solely on shifting features along the sequence dimension for mixing
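To make the idea of mixing by shifting concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the repository's code: the names `token_shift` and `num_shifts` are hypothetical, and the scheme shown (split the feature dimension into equal chunks, delay chunk `k` by `k` positions, zero-pad the start) is just one simple way such a shift could be done.

```python
def token_shift(x, num_shifts):
    """Delay feature chunks along the sequence axis (hypothetical helper).

    x is a list of seq_len rows, each a list of dim numbers.
    The features are split into num_shifts equal chunks; chunk k at
    position t is copied from position t - k, or left as zero padding
    when t - k falls before the start of the sequence.
    """
    seq_len, dim = len(x), len(x[0])
    assert dim % num_shifts == 0, "dim must divide evenly into chunks"
    chunk = dim // num_shifts

    out = [[0.0] * dim for _ in range(seq_len)]
    for t in range(seq_len):
        for k in range(num_shifts):
            src = t - k  # only current or earlier positions, so the shift is causal
            if src < 0:
                continue  # nothing earlier to copy; keep the zero padding
            lo, hi = k * chunk, (k + 1) * chunk
            out[t][lo:hi] = x[src][lo:hi]
    return out


# Three tokens with two features each; the second feature is delayed by one step.
tokens = [[1, 2], [3, 4], [5, 6]]
shifted = token_shift(tokens, num_shifts=2)
print(shifted)  # [[1, 0.0], [3, 2], [5, 4]]
```

Because each output position only reads from the same or earlier positions, a per-token layer applied after the shift can combine information across time while staying autoregressive.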