ILU: Added host-based parallel ILU based on paper by Chow and Patel, SISC 2015.

CUDA and OpenCL backends will follow as separate commits.

Comments:
 - Convergence looks reasonable, but additional component tests are desired (triangular solve, ILU, etc.).
 - Matrix transposition is only single-threaded and can become a bottleneck (less severe than for AMG, though).
 - Best performance requires experimentation with the number of sweeps and the number of Jacobi iterations. Values between 1 and 4 look reasonable for both.
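
For readers unfamiliar with the "sweeps" mentioned above, the following is a minimal sketch of the fixed-point ILU(0) iteration in the style of Chow and Patel. It is not the code added in this commit: the function names `parallel_ilu0_sweeps` and `pattern_residual` are hypothetical, the matrix is stored as a dense list of lists for brevity, and the initial guess (lower and upper parts of A) is an assumption. Each sweep computes every entry from the values of the previous sweep only, so all entries could be updated in parallel.

```python
def parallel_ilu0_sweeps(A, num_sweeps):
    """Fixed-point ILU(0) sweeps on a dense list-of-lists matrix A (sketch)."""
    n = len(A)
    # Sparsity pattern: positions of the nonzeros of A (no fill-in is allowed).
    pattern = [(i, j) for i in range(n) for j in range(n) if A[i][j] != 0.0]
    # Assumed initial guess: strictly lower part of A for L (unit diagonal),
    # upper part of A (including the diagonal) for U.
    L = [[1.0 if i == j else (A[i][j] if i > j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[A[i][j] if i <= j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(num_sweeps):
        # Write into copies so that every update reads only old values.
        # This mimics all entries being updated concurrently.
        L_new = [row[:] for row in L]
        U_new = [row[:] for row in U]
        for i, j in pattern:
            s = sum(L[i][k] * U[k][j] for k in range(min(i, j)))
            if i > j:
                L_new[i][j] = (A[i][j] - s) / U[j][j]
            else:
                U_new[i][j] = A[i][j] - s
        L, U = L_new, U_new
    return L, U


def pattern_residual(A, L, U):
    """Largest |A_ij - (L*U)_ij| over the nonzero pattern of A."""
    n = len(A)
    return max(
        abs(A[i][j] - sum(L[i][k] * U[k][j] for k in range(n)))
        for i in range(n) for j in range(n) if A[i][j] != 0.0
    )


if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0, 0.0],
         [-1.0, 4.0, -1.0, 0.0],
         [0.0, -1.0, 4.0, -1.0],
         [0.0, 0.0, -1.0, 4.0]]
    for sweeps in (1, 2, 4, 10):
        L, U = parallel_ilu0_sweeps(A, sweeps)
        print(sweeps, pattern_residual(A, L, U))
```

Printing the residual for a few sweep counts is one simple way to experiment with the number of sweeps on a given matrix. A real implementation would use a sparse format and a threaded loop over the nonzeros instead of the sequential loop shown here.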