OpenMP: Performance optimization of amg_coarse_ag_stage1_mis2.
Removed unnecessary reads and writes from work arrays. Observed performance gains of about 10 percent on an AMD A10-5800K with four threads. Results are likely to be similar on other machines. Similar optimizations can be applied to the CUDA and OpenCL backends; these will be added in a later commit.