Achieved a tenfold speedup in C for multiplying a tall, skinny matrix by another matrix, using OpenMP, cache blocking, register blocking, and loop unrolling. The matrices were stored in column-major format. - View it on GitHub
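As a rough illustration of how these techniques fit together, here is a minimal sketch in C. It is not the project's actual code: the function names, the `IDX` macro, and the `BLOCK_ROWS` tile size are hypothetical choices made for this example. It assumes A is m x k (tall and skinny, so m is much larger than k), B is k x n, and all three matrices are column-major. Rows of A are processed in tiles so each tile can stay in cache while every column of B is applied to it, the loop over k is unrolled by two, and the two B values are held in scalars as a simple stand-in for register blocking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile height; in practice this would be tuned to the cache size. */
#define BLOCK_ROWS 64

/* Offset of element (i, j) in a column-major matrix with `rows` rows. */
#define IDX(i, j, rows) ((size_t)(i) + (size_t)(j) * (size_t)(rows))

/* Reference version: C = A * B, with A (m x k), B (k x n), C (m x n). */
void matmul_naive(int m, int n, int k,
                  const double *A, const double *B, double *C)
{
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += A[IDX(i, p, m)] * B[IDX(p, j, k)];
            C[IDX(i, j, m)] = sum;
        }
    }
}

/* Blocked version: same result, computed one tile of rows at a time. */
void matmul_blocked(int m, int n, int k,
                    const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    /* Each iteration owns a distinct block of rows of C, so threads never
       write to the same element. Without OpenMP enabled, the pragma is
       ignored and the loop runs serially. */
    #pragma omp parallel for schedule(static)
    for (int ii = 0; ii < m; ii += BLOCK_ROWS) {
        int i_end = (ii + BLOCK_ROWS < m) ? ii + BLOCK_ROWS : m;

        for (int j = 0; j < n; j++) {
            double *c_col = &C[IDX(0, j, m)];
            for (int i = ii; i < i_end; i++)
                c_col[i] = 0.0;

            int p = 0;
            /* Unrolled by two over p: two entries of B are kept in scalars
               and two columns of A are streamed with unit stride. */
            for (; p + 1 < k; p += 2) {
                double b0 = B[IDX(p, j, k)];
                double b1 = B[IDX(p + 1, j, k)];
                const double *a0 = &A[IDX(0, p, m)];
                const double *a1 = &A[IDX(0, p + 1, m)];
                for (int i = ii; i < i_end; i++)
                    c_col[i] += a0[i] * b0 + a1[i] * b1;
            }
            /* Remainder iteration when k is odd. */
            for (; p < k; p++) {
                double b0 = B[IDX(p, j, k)];
                const double *a0 = &A[IDX(0, p, m)];
                for (int i = ii; i < i_end; i++)
                    c_col[i] += a0[i] * b0;
            }
        }
    }
}
```

With GCC or Clang, building with `-fopenmp` enables the parallel loop. The innermost loops walk down columns, which is the contiguous direction in column-major storage. Any speedup depends on the matrix shapes, the tile size, and the hardware, so this sketch makes no claim about reproducing the tenfold figure.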