Commit f62c7deb authored by Karl Rupp

GEMM: Substantially improved pure CPU-based implementation.

About a factor of 20 faster than the previous implementation.
I estimate that further micro-tuning can yield another factor of 2.
Higher performance gains will most likely require intrinsics.
parent 82757cd2
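
For context on what a pure CPU GEMM without intrinsics can look like, here is a minimal sketch of a cache-blocked matrix-matrix product. It is not the code from this commit: the function name `gemm_blocked`, the row-major layout, and the default block size of 64 are all assumptions made for illustration. The sketch only shows the general idea of tiling the loops so that each tile of `B` and `C` stays in cache while it is reused.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the commit): computes C = A * B for
// row-major matrices, where A is m x k, B is k x n, and C is m x n.
// 'block' is an assumed tuning parameter; a good value depends on cache sizes.
void gemm_blocked(const std::vector<double>& A,
                  const std::vector<double>& B,
                  std::vector<double>& C,
                  std::size_t m, std::size_t n, std::size_t k,
                  std::size_t block = 64)
{
  C.assign(m * n, 0.0);  // resize the result and start from zero

  // Outer loops walk over tiles of the iteration space.
  for (std::size_t ii = 0; ii < m; ii += block)
    for (std::size_t kk = 0; kk < k; kk += block)
      for (std::size_t jj = 0; jj < n; jj += block)
      {
        const std::size_t i_end = std::min(ii + block, m);
        const std::size_t k_end = std::min(kk + block, k);
        const std::size_t j_end = std::min(jj + block, n);

        // Inner loops work on one tile. The i-p-j order keeps the innermost
        // loop running over contiguous memory in both B and C.
        for (std::size_t i = ii; i < i_end; ++i)
          for (std::size_t p = kk; p < k_end; ++p)
          {
            const double a = A[i * k + p];
            for (std::size_t j = jj; j < j_end; ++j)
              C[i * n + j] += a * B[p * n + j];
          }
      }
}
```

Calling `gemm_blocked(A, B, C, m, n, k)` with `A` holding `m * k` entries and `B` holding `k * n` entries fills `C` with the `m * n` entries of the product. The contiguous inner loop leaves room for the compiler to auto-vectorize, which is one reason this kind of loop ordering is a common starting point before resorting to intrinsics.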