making kernel is slow

Hi,

Building a kernel in loopy seems to be quite slow when there are many instructions. Here I have the so-called Holzapfel kernel generated by firedrake/tsfc, which has about 8k instructions. lp.make_kernel() takes around 55 seconds on my machine. Profiling indicates that most of the time is spent in instruction.uniquify_instruction_ids and creation.resolve_dependencies (the latter calls fnmatch about 64 million times, i.e. roughly 8k x 8k, in its double loop over the instructions).

Are there any suggestions or recommended practices for mitigating this? (I'm more than happy to help if there is implementation work to do, of course.)
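To make the fnmatch point concrete, here is a rough sketch of the pattern I think I'm seeing, together with one possible shortcut. This is not loopy's actual code: `resolve_naive` and `resolve_fast` are hypothetical names, and I'm assuming that each dependency pattern is matched against every instruction id. The idea is that a pattern without wildcard characters can only match itself, so a set lookup could replace the inner loop.

```python
from fnmatch import fnmatchcase

# Characters that give an fnmatch pattern non-literal meaning.
WILDCARD_CHARS = frozenset("*?[")


def resolve_naive(ids, patterns_by_insn):
    """Match every dependency pattern against every id (quadratic)."""
    resolved = {}
    for insn_id, patterns in patterns_by_insn.items():
        deps = set()
        for pat in patterns:
            for other in ids:
                if fnmatchcase(other, pat):
                    deps.add(other)
        resolved[insn_id] = deps
    return resolved


def resolve_fast(ids, patterns_by_insn):
    """Same result, but plain ids are resolved with a set lookup."""
    id_set = set(ids)
    resolved = {}
    for insn_id, patterns in patterns_by_insn.items():
        deps = set()
        for pat in patterns:
            if WILDCARD_CHARS.isdisjoint(pat):
                # No wildcard: the pattern can only match itself.
                if pat in id_set:
                    deps.add(pat)
            else:
                # Real glob pattern: fall back to the full scan.
                deps.update(o for o in ids if fnmatchcase(o, pat))
        resolved[insn_id] = deps
    return resolved
```

With seq_dependencies=True I would expect most dependencies to be plain ids, so the set lookup would presumably be taken almost every time, but I have not checked this against loopy itself.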

Please see the script below and the pickled objects uploaded.

Many thanks!

-TJ

P.S. Code generation takes only 6 seconds on this kernel.

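import pickle
import loopy as lp

# Load the pickled kernel inputs (see the attachments below).
with open("data.file", "rb") as f:
    data = pickle.load(f)
with open("domains.file", "rb") as f:
    domains = pickle.load(f)
with open("instructions.file", "rb") as f:
    instructions = pickle.load(f)

# This call is the slow part: around 55 seconds for about 8k instructions.
knl = lp.make_kernel(domains, instructions, data, name="test", target=lp.CTarget(), seq_dependencies=True)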
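In case it helps with reproducing the numbers, a profile like the one described above can be collected with the standard library's cProfile. This is a minimal sketch; `profile_call` is a hypothetical helper name, not part of loopy.

```python
import cProfile
import io
import pstats


def profile_call(func, top=15):
    """Run func() under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)

    # Collect the report in a string, sorted by cumulative time.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

It can be used by wrapping the make_kernel call in a lambda, e.g. `knl, report = profile_call(lambda: lp.make_kernel(domains, instructions, data, name="test", target=lp.CTarget(), seq_dependencies=True))`, and then printing `report`.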

data.file

domains.file

instructions.file