lucidrains/memory-efficient-attention-pytorch

lucidrains

Fetched on 2026/06/26 11:03

Implementation of memory-efficient multi-head attention, as proposed in the paper "Self-attention Does Not Need O(n²) Memory".

Stars: 393

Rank: 100334