Faster 3D M2Ls using precomputed rotation matrices (!79) · Merge requests · Andreas Klöckner / boxtree

Matt Wala requested to merge fast-m2l into master May 31, 2019

This change uses the mplocquadu2_trunc version of the M2L translation operators from fmmlib. These operators use a precomputed rotation matrix to speed up the rotations for the "point and shoot" translation.

To obtain the rotation matrices, the traversal now computes the rotation angles needed for the M2L translation. It records the translation vectors when List 2 is built and then computes the "rotation class" of each translation vector.
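
A rough sketch of the grouping idea, not the code in this merge request: it assumes that a rotation class is determined by the polar angle between the translation vector and the z axis, so that all vectors with the same angle can share one precomputed matrix. The function name `rotation_classes` and the rounding tolerance are hypothetical.

```python
import numpy as np


def rotation_classes(translation_vecs, decimals=12):
    """Group translation vectors by polar angle (assumed class criterion).

    Returns the distinct angles and, for each input vector, the index of
    its angle in that array. Hypothetical helper, for illustration only.
    """
    vecs = np.asarray(translation_vecs, dtype=np.float64)
    # Angle between each translation vector and the z axis.
    theta = np.arctan2(np.hypot(vecs[:, 0], vecs[:, 1]), vecs[:, 2])
    # Round so that angles differing only by floating-point noise merge.
    angles, classes = np.unique(np.round(theta, decimals), return_inverse=True)
    return angles, classes.reshape(-1)


# Five translation vectors; the first two share the polar angle pi/2.
vecs = [[2, 0, 0], [0, 2, 0], [0, 0, 2], [0, 0, -2], [2, 0, 2]]
angles, classes = rotation_classes(vecs)
```

Here five translation vectors collapse to four classes, which is what keeps the number of matrices to precompute small compared to the number of List 2 entries.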

The FMMLIB wrangler gains a new optional "geometry data" parameter which supplies the rotation classes for List 2. The wrangler uses this information to precompute rotation matrices when doing the M2L. Because the matrices can get quite large, there is a memory cutoff threshold beyond which we fall back to the regular version.
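
The cutoff logic could look roughly like the following sketch. Everything in it is an assumption made for illustration: the function name `choose_m2l_variant`, the default threshold, and the storage estimate (one forward and one backward real-valued matrix per rotation class, each with `(order + 1)**2 * (2*order + 1)` entries) are not taken from this merge request.

```python
def choose_m2l_variant(n_rotation_classes, order,
                       max_bytes=10**8, bytes_per_entry=8):
    """Decide whether precomputing rotation matrices fits in memory.

    Hypothetical helper: the size estimate and the default threshold
    are assumptions, not values used by the wrangler.
    """
    # Assumed size of one rotation matrix for expansions of this order.
    entries_per_matrix = (order + 1) ** 2 * (2 * order + 1)
    # Assumed: a forward and a backward matrix per rotation class.
    needed = 2 * n_rotation_classes * entries_per_matrix * bytes_per_entry
    variant = "precomputed" if needed <= max_bytes else "regular"
    return variant, needed


small = choose_m2l_variant(n_rotation_classes=10, order=10)
large = choose_m2l_variant(n_rotation_classes=1000, order=40)
```

Under these assumptions the storage grows cubically with the order, which is why a cutoff is needed at all.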

Some timing results on my laptop, taken from the test test_fmm_with_optimized_3d_m2l:

Laplace, 10^4 sources and 10^4 targets:

Order 10:

Baseline M2L time : 6.351 s
Optimized M2L time: 3.327 s

Order 20:

Baseline M2L time : 34.12 s
Optimized M2L time: 19.19 s

Helmholtz, 10^4 sources and 10^4 targets:

Order 10:

Baseline M2L time : 22.36 s
Optimized M2L time: 19.63 s

Order 20:

Baseline M2L time : 142 s
Optimized M2L time: 130.9 s
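
For reference, the speedups implied by the timings above can be computed as follows. They come out to roughly 1.9x and 1.8x for Laplace at orders 10 and 20, and about 1.14x and 1.08x for Helmholtz. The dictionary layout is just a convenience for this snippet.

```python
# (kernel, order) -> (baseline M2L time, optimized M2L time), in seconds,
# copied from the timing results above.
timings = {
    ("Laplace", 10): (6.351, 3.327),
    ("Laplace", 20): (34.12, 19.19),
    ("Helmholtz", 10): (22.36, 19.63),
    ("Helmholtz", 20): (142.0, 130.9),
}

# Speedup = baseline time divided by optimized time.
speedups = {
    key: baseline / optimized
    for key, (baseline, optimized) in timings.items()
}

for (kernel, order), speedup in speedups.items():
    print(f"{kernel:9s} order {order}: {speedup:.2f}x")
```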

Point requirements.txt back to pyfmmlib master after pyfmmlib!13 (merged) is merged
Run pytential against this, make sure CI passes (pytential!152 (merged), pytential!161 (closed))

Edited Jul 09, 2019 by Matt Wala

