Implementation of Token Shift GPT - An autoregressive model that relies solely on shifting the sequence space for mixing
Stars: 48
Rank: 469474
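
To illustrate what "shifting the sequence space for mixing" can mean, here is a minimal pure-Python sketch. It is not taken from the repository: the function name `token_shift`, the `shift_fraction` parameter, and the choice to shift half of each token's features by exactly one position are all assumptions made for illustration. Part of each token's feature vector is replaced by the same slice from the previous token, so information moves forward along the sequence without attention, and position t never sees anything after t.

```python
def token_shift(seq, shift_fraction=0.5):
    """Shift the trailing part of each token's features one step forward in time.

    seq: list of tokens, each a list of floats of equal length.
    shift_fraction: hypothetical knob for how many channels are shifted.
    """
    if not seq:
        return []
    dim = len(seq[0])
    keep = dim - int(dim * shift_fraction)  # channels left in place
    out = []
    for t, token in enumerate(seq):
        # The first position has no predecessor, so it is padded with zeros.
        prev = seq[t - 1] if t > 0 else [0.0] * dim
        # Unshifted channels come from the current token,
        # shifted channels come from the previous token.
        out.append(token[:keep] + prev[keep:])
    return out


seq = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
shifted = token_shift(seq)
for row in shifted:
    print(row)
```

Because every output position reads only from itself and its predecessor, the operation is causal, which is what an autoregressive model needs. In a full model, a shift like this would presumably be interleaved with per-token feedforward layers that combine the shifted and unshifted channels.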