G-KV is a KV cache eviction method that uses a global scoring mechanism, combining local and historical attention scores, to assess token importance more accurately.
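
To illustrate the idea of mixing local and historical attention scores, here is a minimal Python sketch. It is not G-KV's actual implementation: the description does not give the scoring formula, so the combination rule (a decayed historical score compared against a max-normalized local score), the `decay` parameter, and the function names `global_scores` and `select_tokens_to_keep` are all assumptions made for illustration.

```python
from typing import List


def global_scores(history: List[float], local: List[float],
                  decay: float = 0.9) -> List[float]:
    """Blend historical and local attention scores per token.

    Hypothetical rule (an assumption, not taken from G-KV):
        global[i] = max(decay * history[i], local[i] / max(local))
    """
    if len(history) != len(local):
        raise ValueError("history and local must have the same length")
    peak = max(local, default=0.0)
    # Normalize local scores so they are comparable across decoding steps.
    normalized = [s / peak if peak > 0 else 0.0 for s in local]
    return [max(decay * h, n) for h, n in zip(history, normalized)]


def select_tokens_to_keep(scores: List[float], budget: int) -> List[int]:
    """Return indices of the `budget` highest-scoring tokens, in position order."""
    if budget <= 0:
        return []
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


# Token 0 was important earlier but receives no attention in the current window.
history = [0.9, 0.1, 0.2, 0.0]
local = [0.0, 0.1, 0.4, 0.2]

scores = global_scores(history, local)
keep = select_tokens_to_keep(scores, budget=2)        # uses history + local
local_only = select_tokens_to_keep(local, budget=2)   # uses local scores alone
```

In this toy input, ranking by local scores alone would evict token 0, while the blended score retains it because of its past importance. That is the kind of case a global score is meant to handle.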