Commit 6195af1d authored by Philippe Tillet

Device-Specific/GEMM: Now using floatn* instead of float* + vload.

The fallback kernel has to be used whenever simd_width > 1 && (ldstartA % simd_width > 0 || ldstartB % simd_width > 0).
This is quite a mess, but with this commit it will be easier to see whether vstore() and vload() cause any performance regression, and to use floatn* everywhere if they do.
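
The fallback condition above can be sketched as a host-side check. This is a minimal illustration and not code from the commit: the function name `needs_fallback_kernel` is hypothetical, and it assumes that `ldstartA` and `ldstartB` are the starting offsets of A and B counted in scalar elements. The reasoning is that a floatn* access needs an address aligned to the vector width, whereas vload() only needs scalar alignment.

```cpp
#include <cstddef>

// Hypothetical helper (not part of the commit): returns true when the
// fallback kernel has to be used instead of the floatn* kernel.
// Assumption: ldstartA and ldstartB are starting offsets in scalar elements.
bool needs_fallback_kernel(std::size_t simd_width,
                           std::size_t ldstartA,
                           std::size_t ldstartB)
{
    // With simd_width == 1 every access is scalar, so alignment is not an issue.
    if (simd_width <= 1)
        return false;

    // A floatn* pointer cannot address an offset that is not a multiple of n.
    return (ldstartA % simd_width > 0) || (ldstartB % simd_width > 0);
}
```

For example, with simd_width = 4, offsets of 8 and 16 allow the floatn* kernel, while an offset of 6 on either operand forces the fallback.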
parent ecc5c609