- Aug 23, 2015
- Aug 19, 2015
-
-
Karl Rupp authored
Typo fixes related to the Chow-Patel ILU
-
Patrick Sanan authored
This includes one change that affects execution: a fix for a bug in an assertion in the host-based Chow-Patel code (recently fixed in another implementation). Also included are two changes to docs and output:
- Update to the manual to note that some preconditioners can be set up on the device
- Fix to some output in the benchmarks (ILU --> ICC)
-
- Aug 18, 2015
- Aug 17, 2015
-
-
Karl Rupp authored
C = prod(A, trans(B)); with A in COO format and dense matrices B, C produced incorrect results with OpenCL. The bug was a copy-and-paste error.
-
- Aug 11, 2015
-
-
Karl Rupp authored
Previous two commits fixed bugs now found because the test no longer has to be compiled with NDEBUG.
-
Karl Rupp authored
Matrices were incorrectly assumed to be square when using std::vector<std::map<> >. Also, resizing for device->host copy was not carried out for empty std::vector<std::map<> >.
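The dimension issue above can be illustrated with a minimal sketch. It assumes a sparse matrix stored as one std::map<column, value> per row; the names SparseRows and infer_dimensions are hypothetical and are not taken from ViennaCL. The row count comes from the outer vector, while the column count has to be derived from the largest stored column index (or passed in explicitly) rather than assumed equal to the row count.

```cpp
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical type alias: one ordered map (column index -> value) per row.
typedef std::vector<std::map<unsigned int, double> > SparseRows;

// Hypothetical helper: returns (rows, cols) without assuming a square matrix.
// An empty container yields (0, 0).
std::pair<std::size_t, std::size_t> infer_dimensions(SparseRows const & rows)
{
  std::size_t num_cols = 0;
  for (std::size_t i = 0; i < rows.size(); ++i)
  {
    if (!rows[i].empty())
    {
      // std::map is ordered by key, so the last entry has the largest column index.
      std::size_t last_col = rows[i].rbegin()->first;
      if (last_col + 1 > num_cols)
        num_cols = last_col + 1;
    }
  }
  return std::make_pair(rows.size(), num_cols);
}
```

Note that trailing all-zero columns cannot be recovered from the stored entries alone, so a real interface may still need an explicit column count.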
-
Karl Rupp authored
New matrix dimensions were set too late, hence triggering assertions in viennacl::copy().
-
Karl Rupp authored
Also improved checks of return values. Increased matrix sizes of right hand sides from 1 to 5 to properly test the operations.
-
Karl Rupp authored
The code matrix<T> A; A = trans(B); resulted in a memory error, because operator= tried to resize A while preserving its content. Since there is no content, an error was triggered. This commit fixes the problem by calling resize() such that no values are preserved (i.e. third parameter set to 'false').
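A minimal sketch of the idea, using a hypothetical DenseMatrix class rather than ViennaCL's actual matrix<T>: when a matrix is the target of an assignment, its old content is irrelevant, so the resize can skip the preserving copy. The class, its member names, and assign_transposed are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense row-major matrix; not ViennaCL's implementation.
class DenseMatrix
{
public:
  DenseMatrix() : rows_(0), cols_(0) {}
  DenseMatrix(std::size_t r, std::size_t c) : rows_(r), cols_(c), data_(r * c, 0.0) {}

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }
  double & operator()(std::size_t i, std::size_t j)       { return data_[i * cols_ + j]; }
  double   operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

  // preserve == true copies the overlapping block into the new storage;
  // preserve == false only allocates zero-initialized storage.
  void resize(std::size_t r, std::size_t c, bool preserve)
  {
    std::vector<double> fresh(r * c, 0.0);
    if (preserve)
    {
      for (std::size_t i = 0; i < r && i < rows_; ++i)
        for (std::size_t j = 0; j < c && j < cols_; ++j)
          fresh[i * c + j] = data_[i * cols_ + j];
    }
    data_.swap(fresh);
    rows_ = r;
    cols_ = c;
  }

  // Assign the transpose of B. The old content of *this is overwritten anyway,
  // so the resize does not preserve it. Aliasing (B being *this) is not handled.
  void assign_transposed(DenseMatrix const & B)
  {
    resize(B.cols(), B.rows(), false);
    for (std::size_t i = 0; i < B.rows(); ++i)
      for (std::size_t j = 0; j < B.cols(); ++j)
        (*this)(j, i) = B(i, j);
  }

private:
  std::size_t rows_, cols_;
  std::vector<double> data_;
};
```

With this sketch, a default-constructed (empty) target can be assigned from a 2x3 matrix and ends up 3x2 without any attempt to copy nonexistent old values.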
-
Karl Rupp authored
-
Karl Rupp authored
-
- Aug 10, 2015
-
-
Karl Rupp authored
Algorithm was fine, but assertion checked wrong row/column index. Solver benchmarks set NDEBUG, so the issue wasn't detected. Changed NDEBUG to BOOST_UBLAS_NDEBUG so that the issue won't show up again.
-
- Aug 08, 2015
-
-
Karl Rupp authored
-
- Aug 06, 2015
-
-
Karl Rupp authored
schedule(dynamic, N ? (A * B) + 1) resulted in problems. Hence moved chunk size calculation out of the for loop.
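A hedged sketch of the pattern described above: the chunk size is computed into a plain variable before the loop, and only that variable appears in the schedule() clause. The function names and the chunk-size formula are placeholders chosen for illustration; they are not the expression used in the actual commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunk-size rule: split n iterations into roughly
// (threads * chunks_per_thread) pieces, with a minimum chunk size of 1.
long compute_chunk_size(long n, long threads, long chunks_per_thread)
{
  assert(threads > 0 && chunks_per_thread > 0);
  return n / (threads * chunks_per_thread) + 1;
}

// Scales every entry of v by alpha. The pragma only takes effect when the
// code is compiled with OpenMP enabled; otherwise the loop runs serially.
void scale(std::vector<double> & v, double alpha, long threads, long chunks_per_thread)
{
  long n = static_cast<long>(v.size());

  // Evaluate the chunk size once, outside the pragma, instead of placing
  // the arithmetic expression inside the schedule() clause.
  long chunk = compute_chunk_size(n, threads, chunks_per_thread);

#ifdef _OPENMP
  #pragma omp parallel for schedule(dynamic, chunk)
#endif
  for (long i = 0; i < n; ++i)
    v[static_cast<std::size_t>(i)] *= alpha;
}
```

Keeping the clause argument a simple variable avoids relying on how a particular compiler parses expressions inside the pragma.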
-
- Aug 05, 2015
-
-
Karl Rupp authored
Previous attempt used a dispatch based on __CUDA_ARCH__, which turned out to be insufficient (__CUDA_ARCH__ only defined in kernel compilation stage, but not in host compilation stage -> BOOM). The new code queries the CUDA arch in the first run. This may lead to non-optimal selections if a user switches the CUDA device after the first SpMV has been run, but this is likely to be rare. A repeated query in each SpMV, however, is too costly, as the device query has about the same overhead as a kernel launch.
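The query-once approach can be sketched in plain C++ with a function-local static. Everything here is hypothetical: expensive_arch_query stands in for a real device query, the returned value 350 and the threshold 300 are made-up numbers, and the variant names are not ViennaCL's. The counter exists only to make the caching observable.

```cpp
// Counts how often the (simulated) device query runs; for demonstration only.
int query_count = 0;

// Hypothetical stand-in for a costly runtime query of the device architecture.
int expensive_arch_query()
{
  ++query_count;
  return 350;  // made-up architecture number
}

// The function-local static is initialized on the first call only, so the
// query runs once; later calls reuse the cached value.
int cached_arch()
{
  static int arch = expensive_arch_query();
  return arch;
}

enum SpmvVariant { SPMV_GENERIC, SPMV_TUNED };

// Host-side dispatch based on the cached value (threshold is an assumption).
SpmvVariant select_spmv_variant()
{
  return cached_arch() >= 300 ? SPMV_TUNED : SPMV_GENERIC;
}
```

As the commit message notes, the trade-off is that a device switch after the first call goes unnoticed, because the cached value is never refreshed.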
-
- Jul 31, 2015
- Jul 30, 2015
-
-
Karl Rupp authored
AddCCompilerFlagIfSupported.cmake and AddCLinkerFlagIfSupported.cmake were not included in the dist and dist-src targets.
-
Karl Rupp authored
Visual Studio 2012 ran into ambiguities with respect to conversions. Adding tie-breaker overloads of operator= fixed the problems.
-
Karl Rupp authored
Most notably:
- Added fine-grained ILU
- Described custom compilation via command line
- Better iterative solver description (including mixed-precision CG)
- Updated the GPU support table
- Sparse matrix-matrix products
-
Karl Rupp authored
Improves performance on my laptop by a factor of 3.
-
Karl Rupp authored
Resulted in build failures on Visual Studio 2012.
-
Karl Rupp authored
Improves performance on NVIDIA GPUs by about 10 percent on average. Also reduces memory footprint a little.
-
Karl Rupp authored
AMG: Fix a typo: coarseing -> coarsening.
-
Karl Rupp authored
OpenCL context handle was accidentally not set.
-
- Jul 29, 2015
-
-
Bruno Turcksin authored
-
Karl Rupp authored
The same problem showed up with OpenCL earlier in 216a6ac4. I assume that we are hitting a bug in the CUDA stack here, since the problem only shows up on some CUDA devices (e.g. K20m) and only with certain build configurations. A debug build, for example, does not show any issues. See also the follow-up discussion in #147.
-
Karl Rupp authored
-
Karl Rupp authored
Since no standalone PDF manual is available anymore, this option became obsolete.
-
Karl Rupp authored
The respective tuning code for ViennaProfiler is no longer in ViennaCL, so this optional dependency is obsolete.
-
Karl Rupp authored
This is to also support lines such as compressed_matrix<T> A = prod(B, C); So far only operator= was supported.
-
Karl Rupp authored
-
Karl Rupp authored
The old API is still supported. The new API uses solver objects, with which the initial guess as well as the monitor callbacks are registered. New tutorial for usage: iterative-custom. Resolves #97.
-
- Jul 28, 2015
-
-
Karl Rupp authored
Resolves #147.
-