Code for the paper "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023).
Stars: 57
Rank: 463652