Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023)