compressed_matrix: Optimizing host-based sparse matrix-matrix product.
Performance relatively close to MKL (within 2x).
Possible further tweaks:
 - reduce overhead of resize()
 - use row_buffer for C when scanning nonzero pattern instead of temporary buffer
 - parallel exclusive-scan of row_buffer for C

SpGEMM: Improved host-based implementation.
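
For context on the exclusive-scan tweak: once the nonzero pattern of C has been scanned, the per-row nonzero counts have to be turned into CSR row offsets, which is an exclusive prefix sum. Below is a minimal serial sketch of that step. The function name `row_offsets_from_counts` and the use of `std::vector<unsigned int>` are assumptions made for illustration, not the code in this commit, and the parallel variant suggested above is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): converts per-row nonzero
// counts of C into CSR row offsets via a serial exclusive scan.
// offsets[i] is the index of the first nonzero of row i,
// offsets[rows] is the total number of nonzeros in C.
std::vector<unsigned int> row_offsets_from_counts(const std::vector<unsigned int>& row_nnz)
{
  std::vector<unsigned int> offsets(row_nnz.size() + 1, 0);
  for (std::size_t i = 0; i < row_nnz.size(); ++i)
    offsets[i + 1] = offsets[i] + row_nnz[i];  // running sum of all earlier rows
  return offsets;
}
```

A parallel version would typically split the rows into chunks, scan each chunk independently, and then add each chunk's starting offset in a second pass.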