- Dec 10, 2014
-
-
Karl Rupp authored
-
Karl Rupp authored
Resulted in some complaints from -fsanitize=undefined.
-
Karl Rupp authored
Thanks to AddressSanitizer in GCC 4.9
-
Karl Rupp authored
-Wall -Wextra -pedantic -Wconversion. A tricky detail is that the product of two chars or shorts gets promoted to 'int', so some extra logic was required.
-
- Dec 09, 2014
-
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
Resolves #112 (together with previous commit).
-
Karl Rupp authored
This is in response to issue #112, which reports that the use of OpenMP-'variables' derived from templates is unspecified/undefined. Problems have been observed with the Fujitsu compiler.
-
Karl Rupp authored
Resulted in complaints when called from GMRES, so it is now passed in as a pointer. This should not affect performance, but verification is desired.
-
Karl Rupp authored
Without these, some user code may accidentally use the solvers without pipelining, which is not what we want...
-
- Dec 06, 2014
- Dec 05, 2014
-
-
Karl Rupp authored
Used to be a reference, for which initialization with NULL doesn't work. Now using copies, which is fine due to smart-pointer semantics.
-
- Dec 04, 2014
-
-
Karl Rupp authored
Use of thread-local variables is substantially slower than using shared memory directly in this case: a 2x difference on a Tesla C2050 for this particular kernel. Overall performance gains depend on the sparsity pattern of the matrix (as always).
-
Karl Rupp authored
This was a problem when using compressed_matrix outside the default memory domain.
-
Karl Rupp authored
Results in mild (about 10 percent) performance gains.
-
Karl Rupp authored
Provides about 10 percent better performance on average for a mix of typical matrices from the Florida Sparse Matrix Collection.
- Nov 20, 2014
-
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
Based on experiments on a GTX 470. Kepler and Maxwell GPUs might behave differently.
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
Warnings were due to conversion of floats to bools.
-
- Nov 19, 2014
-
-
Karl Rupp authored
-
Karl Rupp authored
See paper "Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Storage Format" by Greathouse and Daga, presented at SC 2014.
-
Karl Rupp authored
Problems were only found in the matrix-times-matrix case. Resolves #2.
-
Karl Rupp authored
Now a user can directly provide std::vector< std::map<IndexType, NumericType> > to populate the sparse matrix. This was already possible for the other sparse matrix types.
-
- Nov 17, 2014
-
-
Karl Rupp authored
Resolves #6. A little bit of input-dependent tuning is certainly possible, yet the overall control flow is now fixed. Performance is largely dependent on the performance of matrix-vector and matrix-matrix products, respectively.
-
- Nov 16, 2014
-
-
Karl Rupp authored
Disable coveralls. Takes too long to run and hence fails.
-
Karl Rupp authored
Old generator tests and OpenCL random number generation. Both superseded.
-
Dominic Meiser authored
lcov takes much too long, making the Travis builds time out and fail.
-