Error in distributed FMM
Authored by Matt Wala
Applying this patch gives the following results:
$ mpiexec -n 2 python test_distributed.py
INFO:boxtree.tree_build_kernels:start building tree build kernels
INFO:boxtree.tree_build_kernels:tree build kernels built
INFO:boxtree.tree_build:tree build: start
INFO:boxtree.tree_build:LEVEL 1 -> 5 boxes
INFO:boxtree.tree_build:LEVEL 2 -> 21 boxes
INFO:boxtree.tree_build:LEVEL 3 -> 69 boxes
INFO:boxtree.tree_build:LEVEL 4 -> 221 boxes
INFO:boxtree.tree_build:LEVEL 5 -> 661 boxes
INFO:boxtree.tree_build:LEVEL 6 -> 2005 boxes
INFO:boxtree.tree_build:LEVEL 7 -> 5773 boxes
INFO:boxtree.tree_build:LEVEL 8 -> 14505 boxes
INFO:boxtree.tree_build:LEVEL 9 -> 18901 boxes
INFO:boxtree.tree_build:elapsed time: 0.318168 s (1.59084e-07 s/particle/pass)
INFO:boxtree.tree_build:18890 boxes after pruning (11 empty leaves and/or unused boxes removed)
INFO:boxtree.tree_build:tree build complete
Generate tree 0.6774728298187256
INFO:boxtree.traversal:traversal build kernels: start build
INFO:boxtree.traversal:traversal build kernels: done
INFO:boxtree.traversal:start building traversal
INFO:boxtree.traversal:traversal built
Generate traversal 0.35829854011535645
INFO:boxtree.fmm:start fmm
INFO:boxtree.fmm:fmm complete
Shared memory FMM 3.681605100631714
INFO:loopy.kernel.creation:(unnamed): kernel creation start
INFO:loopy.kernel.creation:loopy_kernel: kernel creation done
Generate local tree 2.6324493885040283
Generate local tree 2.635507822036743
INFO:boxtree.traversal:traversal build kernels: start build
INFO:boxtree.traversal:traversal build kernels: done
INFO:boxtree.traversal:start building traversal
INFO:boxtree.traversal:traversal built
INFO:boxtree.traversal:start building traversal
INFO:boxtree.traversal:traversal built
Generate local trav 0.48515748977661133
INFO:pyopencl:build program: kernel 'elwise_kernel' was part of a lengthy cache retrieval (1.01 s)
INFO:boxtree.traversal:traversal build kernels: start build
INFO:boxtree.traversal:traversal build kernels: done
INFO:boxtree.traversal:start building traversal
INFO:boxtree.traversal:traversal built
INFO:boxtree.traversal:start building traversal
INFO:boxtree.traversal:traversal built
Generate local trav 1.4418339729309082
Communication: 0.9604287147521973
Communication: 0.0030205249786376953
List 1: 0.6205658912658691
List 1: 0.7522716522216797
List 3: 0.44483160972595215
List 3: 0.7263567447662354
Distributed FMM 3.232790470123291
INFO:boxtree.distributed:fmm complete
Distributed FMM 2.288548469543457
Total time 6.366271257400513
1.45519152284e-11
unstable.patch 1.32 KiB
diff --git a/boxtree/distributed.py b/boxtree/distributed.py
index 3e06fe9..5954e7f 100644
--- a/boxtree/distributed.py
+++ b/boxtree/distributed.py
@@ -45,7 +45,7 @@ print("Process %d of %d on %s with ctx %s.\n" % (
queue.context.devices))
-COMMUNICATE_MPOLES_VIA_ALLREDUCE = False
+COMMUNICATE_MPOLES_VIA_ALLREDUCE = 1
class LocalTree(Tree):
diff --git a/test/test_distributed.py b/test/test_distributed.py
index 212e623..6dc1147 100644
--- a/test/test_distributed.py
+++ b/test/test_distributed.py
@@ -11,8 +11,8 @@ logging.basicConfig(level=logging.INFO)
# Parameters
dims = 2
-nsources = 10000
-ntargets = 10000
+nsources = 100000
+ntargets = 100000
dtype = np.float64
# Get the current rank
@@ -25,7 +25,7 @@ sources_weights = None
wrangler = None
-ORDER = 3
+ORDER = 8
HELMHOLTZ_K = 0
@@ -77,8 +77,8 @@ if rank == 0:
# Build the tree and interaction lists
from boxtree import TreeBuilder
tb = TreeBuilder(ctx)
- tree, _ = tb(queue, sources, targets=targets, target_radii=target_radii,
- stick_out_factor=0.25, max_particles_in_box=30, debug=True)
+ tree, _ = tb(queue, sources, targets=targets, target_radii=0*target_radii,
+ stick_out_factor=0., max_particles_in_box=30, debug=True)
now = time.time()
print("Generate tree " + str(now - last_time))