Commit 216a6ac4 authored by Karl Rupp

SpGEMM: Workaround for bug in NVIDIA OpenCL compiler.

The if (buffer_size == get_local_size(0)) { ... } block caused problems
with NVIDIA drivers 34x.yz. The error could not be reproduced on
simpler kernels.
Moving the operations on index_in_C and buffer_size out of the block
resolves the issue.

Also introduces a thread-private variable 'local_id' to replace
calls to get_local_id(0) in the same kernel.
This might improve performance slightly.
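
The diff itself is not shown on this page, so the following is only a sketch of the kind of restructuring described above, written as plain C so it compiles without an OpenCL host. The names index_in_C, buffer_size and local_id come from the commit message. Everything else is hypothetical: flush_before, flush_after, buffer, C, LOCAL_SIZE and emulated_local_id are illustrative names, the two get_local_* functions are stand-ins for the OpenCL built-ins, and the use of conditional expressions outside the block is an assumption about how the operations were moved.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the OpenCL work-item built-ins,
   so that this sketch compiles as ordinary C. */
enum { LOCAL_SIZE = 4 };
static size_t emulated_local_id = 0;
static size_t get_local_size(unsigned dim) { (void)dim; return LOCAL_SIZE; }
static size_t get_local_id(unsigned dim)   { (void)dim; return emulated_local_id; }

/* Assumed shape before the change: the updates of index_in_C and
   buffer_size sit inside the conditional block. */
static void flush_before(const unsigned *buffer, unsigned *C,
                         unsigned *index_in_C, unsigned *buffer_size)
{
  if (*buffer_size == get_local_size(0)) {
    C[*index_in_C + get_local_id(0)] = buffer[get_local_id(0)];
    *index_in_C += *buffer_size;
    *buffer_size = 0;
  }
}

/* Assumed shape after the change: only the write stays in the block;
   the bookkeeping is done unconditionally via conditional expressions.
   get_local_id(0) is read once into the private variable local_id. */
static void flush_after(const unsigned *buffer, unsigned *C,
                        unsigned *index_in_C, unsigned *buffer_size)
{
  const unsigned local_id = (unsigned)get_local_id(0);
  const int buffer_full = (*buffer_size == get_local_size(0));

  if (buffer_full) {
    C[*index_in_C + local_id] = buffer[local_id];
  }
  *index_in_C += buffer_full ? *buffer_size : 0;
  *buffer_size = buffer_full ? 0u : *buffer_size;
}
```

Both variants compute the same result for a single work-item; the second one merely keeps the conditional block free of updates to index_in_C and buffer_size, which is the property the workaround relies on.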
parent b3a6f0e1