Generator: Changed index type from uint to size_t for GEMM.
This improves sGEMM performance on AMD Hawaii by about a factor of 2 with more recent drivers, which now use 64-bit addressing. Double-precision performance increases mildly (about 10 percent). No performance change was observed on NVIDIA devices.
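
The kind of change involved can be sketched in plain host-side C. This is not the generated kernel code: the helper names `offset_uint` and `offset_size_t` and the row-major layout are assumptions made for illustration only. One plausible reading of the speedup is that, with 64-bit addressing, an index that already has pointer width needs no widening before it is added to a base address, but the commit itself does not state the cause.

```c
#include <stddef.h>

/* Hypothetical helper: linear offset of element (row, col) in a row-major
   matrix with leading dimension ld, computed with unsigned int indices
   (the index type used before this change). */
static unsigned int offset_uint(unsigned int row, unsigned int col,
                                unsigned int ld)
{
    return row * ld + col;
}

/* Same computation with size_t indices (the index type after this change).
   size_t has the platform's native object-size width, so on a 64-bit
   target the result can be added to a pointer without conversion. */
static size_t offset_size_t(size_t row, size_t col, size_t ld)
{
    return row * ld + col;
}
```

Both helpers return the same offset for indices that fit the narrower type; only the width of the arithmetic differs.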