gmh5225/LLMLingua - Gitstar Ranking

Fetched on 2026/05/08 11:55

[EMNLP'23, ACL'24] To speed up LLM inference and enhance LLMs' perception of key information, LLMLingua compresses the prompt and KV-Cache, achieving up to 20x compression with minimal performance loss.

https://llmlingua.com/

Star

Rank: 13993518
