Device-Specific/GEMM: Removed #pragma unroll for the KL loop
The pragma seems to have a disastrous effect on AMD GPUs. It may still be beneficial on Intel, though. We might consider disabling unrolling only when an AMD device is used. To be investigated.
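
One possible way to disable the unroll hint only on AMD is a macro that expands to the pragma everywhere else. The sketch below is plain C for illustration, not the actual kernel: the names GEMM_TARGET_AMD, GEMM_UNROLL_KL and dot_kl are hypothetical, and it assumes the build can pass a define when the target is an AMD device. Some compilers ignore `#pragma unroll` and only emit a warning.

```c
#include <stddef.h>

/* Hypothetical build flag: GEMM_TARGET_AMD is assumed to be defined
 * by the build system when compiling for an AMD device. */
#if defined(GEMM_TARGET_AMD)
#define GEMM_UNROLL_KL                 /* expands to nothing: no unroll hint */
#else
#define GEMM_UNROLL_KL _Pragma("unroll") /* keep the hint on other targets */
#endif

/* Stand-in for the KL loop: accumulates a[kl] * b[kl] over kl_count terms.
 * The real GEMM kernel is not shown in this commit. */
static float dot_kl(const float *a, const float *b, size_t kl_count)
{
    float acc = 0.0f;
    GEMM_UNROLL_KL
    for (size_t kl = 0; kl < kl_count; ++kl) {
        acc += a[kl] * b[kl];
    }
    return acc;
}
```

With this layout the loop body stays identical on all targets, so the AMD and non-AMD variants could be benchmarked by toggling a single define.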