PCCX is an open NPU architecture for memory-bound Transformer inference on edge FPGAs, focused on GEMM/GEMV, KV-cache, W4A8 quantization, and custom ISA scheduling.