Commit 66a8949c authored May 10, 2015 by Karl Rupp

Merge branch 'karlrupp/sparse-matrix-matrix-product'

karlrupp/sparse-matrix-matrix-product:
 Fast implementations of sparse matrix-matrix products.
 About 1.5x faster than MKL on Haswell if AVX2 enabled.
 About 1.5x faster than CUSP and CUBLAS on NVIDIA GPUs.
 About the same performance on MIC.
 Faster on FirePro W9100 with OpenCL than on a Tesla K20m with CUDA.
 A few more tweaks possible, but will be applied in a separate feature branch.

parents f29e01e0 b3e5daa0
