Commit d2ef9b25 authored by Karl Rupp

MIC: Enhanced kernel parameters for BLAS levels 1, 2, 3.

Parameters were obtained by careful manual tweaking rather than a
full-fledged optimizer run. Memory bandwidths for BLAS levels 1 and 2
reach about 70 GB/sec, which is okay given that vector operations on
the Xeon Phi are slow with OpenCL. BLAS level 3 improves only mildly
and peaks at about 40 GFLOP/sec.

Given that OpenCL for Xeon Phi (KNC) has limited use and that
everybody is eager for KNL, further tuning efforts are suspended.

Resolves #26.
parent 45901648