SpGEMM: Fixed missing barrier in OpenCL kernels.
Until now, the kernels only worked correctly under true lock-step execution. On the CPU, where each work group is executed by a few threads, an additional barrier is required for correct execution. This should also fix problems on some NVIDIA GPUs.
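
To illustrate why the barrier matters once work items no longer run in lock-step, here is a minimal sketch in Python. It is not the SpGEMM kernel code: each thread stands in for one work item, a plain list stands in for local memory, and `threading.Barrier` plays the role that `barrier(CLK_LOCAL_MEM_FENCE)` plays in an OpenCL kernel. The names `run_work_group`, `work_item`, and `local_mem` are hypothetical and chosen only for this example.

```python
import threading

def run_work_group(values):
    """Simulate one work group; each thread acts as one work item."""
    n = len(values)
    local_mem = [None] * n          # stands in for OpenCL __local memory
    result = [None] * n
    barrier = threading.Barrier(n)  # stands in for barrier(CLK_LOCAL_MEM_FENCE)

    def work_item(lid):
        # Phase 1: every work item publishes its own value.
        local_mem[lid] = values[lid]
        # Without this wait, a work item that runs ahead could read a slot
        # its neighbour has not written yet. In true lock-step execution the
        # problem stays hidden because all writes complete together.
        barrier.wait()
        # Phase 2: read the value published by the next work item.
        result[lid] = local_mem[(lid + 1) % n]

    threads = [threading.Thread(target=work_item, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

With the barrier in place, every read in phase 2 is guaranteed to see the value written in phase 1, regardless of how the threads are scheduled.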