
Speed up kernel comparison / loading from cache

Things working:

Things left to do:

  • Instructions should not have to be analyzed at all along the fast path from "loading from cache" to "execute".

    • Cache the generated invokers (generating them also incurs a non-trivial latency penalty). Invoker generation looks at the instructions and so prevents us from being fully lazy.
    • Kernel.copy() iterates through all instructions; add an option to disable this.
  • Full use of lazy data structures

    • Add code to generate eq keys and persistent hash keys for instructions (among other things, this has to handle pymbolic expressions and normalize the order of sets).
    • Use LazilyUnpicklingList for the list of instructions
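
To make the two "lazy data structures" items concrete, here is a rough sketch of what they could look like. It is not the existing code: `stable_key`, `persistent_hash`, and `LazyPickledList` are hypothetical names made up for illustration, the real LazilyUnpicklingList may well work differently, and pymbolic expressions are not handled here at all (instructions are stood in for by plain dicts).

```python
import hashlib
import pickle


def stable_key(obj):
    """Return a key that does not depend on set iteration order (hypothetical helper)."""
    if isinstance(obj, (set, frozenset)):
        # Sort the element keys by repr so equal sets always give equal keys.
        return ("set", tuple(sorted((stable_key(x) for x in obj), key=repr)))
    if isinstance(obj, (list, tuple)):
        return ("seq", tuple(stable_key(x) for x in obj))
    # Assumption: everything else (str, int, ...) has a run-independent repr.
    return obj


def persistent_hash(obj):
    """Hex digest intended to be stable across interpreter runs, unlike hash()."""
    return hashlib.sha256(repr(stable_key(obj)).encode()).hexdigest()


class LazyPickledList:
    """Hypothetical stand-in: each item stays pickled until it is first accessed."""

    def __init__(self, items):
        self._blobs = [pickle.dumps(item) for item in items]
        self._cache = {}
        self.unpickle_count = 0  # only here to make the laziness observable

    def __len__(self):
        return len(self._blobs)

    def __getitem__(self, index):
        # Integer indices only; this also normalizes negative indices
        # and raises IndexError when out of range.
        index = range(len(self._blobs))[index]
        if index not in self._cache:
            self._cache[index] = pickle.loads(self._blobs[index])
            self.unpickle_count += 1
        return self._cache[index]


# Plain dicts stand in for instructions.
insns = LazyPickledList([
    {"id": "insn_a", "depends_on": {"insn_b", "insn_c"}},
    {"id": "insn_b", "depends_on": set()},
])
print(insns.unpickle_count)   # nothing has been unpickled yet
print(insns[1]["id"])         # unpickles only the second item
print(persistent_hash({"a", "b"}) == persistent_hash({"b", "a"}))
```

The point of the sketch is that comparing or hashing by a precomputed key would avoid touching the instruction objects themselves, so a cache hit would not have to unpickle any of them.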

cc: @inducer

See also: pytential#38

Edited by Matt Wala