Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia by Summarizing Long Sequences"
Stars: 71
Rank: 305555