CUDA: Added improved CSR SpMV kernel.
Used whenever the average number of nonzeros per row is larger than 6.5 (Maxwell) or 12.0 (Kepler and earlier). Overall performance is about 10-20 percent better than CUSPARSE.
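
The selection rule described above could look roughly like the following host-side sketch. It is not taken from the commit: the names `SpmvKernel`, `avg_nnz_per_row`, and `select_spmv_kernel` are hypothetical, and mapping "Maxwell" to compute capability major version 5 or higher is an assumption made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names throughout; these are not the identifiers used in the commit.
enum class SpmvKernel { Default, ImprovedCsr };

// Average number of stored nonzeros per row, read off the CSR row-pointer array.
// row_ptr has (rows + 1) entries; the last entry minus the first is the nonzero count.
inline double avg_nnz_per_row(const std::vector<unsigned int>& row_ptr)
{
  if (row_ptr.size() < 2)
    return 0.0;  // no rows: nothing to average
  const std::size_t rows = row_ptr.size() - 1;
  const unsigned int nnz = row_ptr.back() - row_ptr.front();
  return static_cast<double>(nnz) / static_cast<double>(rows);
}

// Pick a kernel from the thresholds stated in the commit message.
// Assumption: compute capability major >= 5 is treated as "Maxwell",
// anything lower as "Kepler and earlier".
inline SpmvKernel select_spmv_kernel(const std::vector<unsigned int>& row_ptr,
                                     int cc_major)
{
  const double threshold = (cc_major >= 5) ? 6.5 : 12.0;
  return (avg_nnz_per_row(row_ptr) > threshold) ? SpmvKernel::ImprovedCsr
                                                : SpmvKernel::Default;
}
```

With a matrix averaging 8 nonzeros per row, this sketch would pick the improved kernel on a Maxwell device but fall back to the default path on Kepler, since 8 lies between the two thresholds.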