Loopy kernel generation (!3) · Merge requests · Andreas Klöckner / pytato

Matt Wala requested to merge codegen into master Jun 03, 2020

Although there are a lot of TODOs and FIXMEs, I think this is in a state where it should be looked over to make sure we have basic agreement on the design.

Overall design:

I imagine code generation will eventually be a multi-stage process. Right now, it is a single pass. Code is generated recursively in CodeGenMapper, using a (partially) mutable CodeGenState. This mapper acts on nodes in the computation graph. For each node, it updates the kernel with the implementation that node needs and returns an ImplementedResult. An ImplementedResult represents a generated value (an array expression), which can be an array, a loopy expression, or a substitution rule. You can convert the generated value to a (scalar) loopy expression via the to_loopy_expression method.
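To make the single-pass structure concrete, here is a minimal, self-contained sketch. It does not use pytato or loopy: `ToyCodeGenState`, `ToyImplementedResult`, `ToyCodeGenMapper`, and the node classes are hypothetical stand-ins, and the "kernel" is reduced to a list of argument names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


# Toy computation-graph nodes (hypothetical, not pytato's classes).
@dataclass(frozen=True)
class Placeholder:
    name: str


@dataclass(frozen=True)
class Sum:
    left: "Node"
    right: "Node"


Node = Union[Placeholder, Sum]


@dataclass(frozen=True)
class ToyImplementedResult:
    """A generated value; in this sketch it only wraps an expression string."""
    expr: str

    def to_loopy_expression(self) -> str:
        return self.expr


@dataclass
class ToyCodeGenState:
    """Partially mutable state: the 'kernel' grows as nodes are visited."""
    arguments: List[str] = field(default_factory=list)
    results: Dict[Node, ToyImplementedResult] = field(default_factory=dict)


class ToyCodeGenMapper:
    def rec(self, node: Node, state: ToyCodeGenState) -> ToyImplementedResult:
        # Reuse the result if this node was already implemented.
        if node in state.results:
            return state.results[node]
        method = getattr(self, "map_" + type(node).__name__.lower())
        result = method(node, state)
        state.results[node] = result
        return result

    def map_placeholder(self, node, state):
        # Implementing a placeholder adds a kernel argument.
        state.arguments.append(node.name)
        return ToyImplementedResult(f"{node.name}[i]")

    def map_sum(self, node, state):
        left = self.rec(node.left, state).to_loopy_expression()
        right = self.rec(node.right, state).to_loopy_expression()
        return ToyImplementedResult(f"({left} + {right})")


state = ToyCodeGenState()
x = Placeholder("x")
result = ToyCodeGenMapper().rec(Sum(x, Sum(x, Placeholder("y"))), state)
print(result.to_loopy_expression())
print(state.arguments)
```

Caching results in the state means a node that appears several times in the graph (`x` above) is implemented only once.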

One complication is that a "loopy expression" is not a pure expression but involves context (e.g., reduction bounds and dependencies). To handle that, I introduced a LoopyExpressionContext class. The idea is that the caller (who wants to use the expression) calls to_loopy_expression with a context that is populated by the callee, and the caller uses it to figure out how to generate the right code for the loopy expression. I am not sure about this part of the design. I haven't implemented anything that actually makes use of the LoopyExpressionContext yet. I would appreciate suggestions.
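The caller/callee protocol described here could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ToyExpressionContext`, `ToyRowSum`, the field names `depends_on` and `reduction_bounds`, and the dictionary standing in for an instruction are not taken from this merge request.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ToyExpressionContext:
    """Filled in by the callee while it builds an expression."""
    depends_on: Set[str] = field(default_factory=set)
    reduction_bounds: Dict[str, Tuple[int, int]] = field(default_factory=dict)


class ToyRowSum:
    """Hypothetical generated value: the sum over one row of a 2D array."""

    def __init__(self, array_name: str, ncols: int, writer_insn: str) -> None:
        self.array_name = array_name
        self.ncols = ncols
        self.writer_insn = writer_insn

    def to_loopy_expression(self, context: ToyExpressionContext) -> str:
        # Callee side: record what the expression needs in order to be valid.
        context.depends_on.add(self.writer_insn)
        context.reduction_bounds["j"] = (0, self.ncols)
        return f"sum(j, {self.array_name}[i, j])"


# Caller side: create an empty context, request the expression, then use the
# populated context to assemble a complete instruction description.
context = ToyExpressionContext()
value = ToyRowSum("a", ncols=4, writer_insn="init_a")
expr = value.to_loopy_expression(context)

instruction = {
    "assignee": "out[i]",
    "expression": expr,
    "depends_on": sorted(context.depends_on),
    "reduction_bounds": dict(context.reduction_bounds),
}
print(instruction)
```

The point of the sketch is only the direction of data flow: the expression string alone is not enough to emit correct code, so the extra requirements travel back to the caller through the context object.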

There's a second mapper for generating expressions for IndexLambda and the like. This is InlinedExpressionGenMapper. It is mutually recursive with CodeGenMapper. It also takes a LoopyExpressionContext.
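A rough illustration of the mutual recursion, with the expression context left out for brevity. The classes below are hypothetical toys: scalar expressions are nested tuples, and a bare string inside one is a name bound to another array.

```python
class ToyPlaceholder:
    def __init__(self, name):
        self.name = name


class ToyIndexLambda:
    def __init__(self, body, bindings):
        self.body = body          # e.g. ("+", "a", "b")
        self.bindings = bindings  # maps names in `body` to arrays


class ToyExprMapper:
    """Walks scalar expressions (stand-in for the inlined expression mapper)."""

    def __init__(self, array_mapper):
        self.array_mapper = array_mapper

    def rec(self, expr, bindings):
        if isinstance(expr, str):
            # A name refers to another array: hand it back to the array mapper.
            return self.array_mapper.rec(bindings[expr])
        op, left, right = expr
        return f"({self.rec(left, bindings)} {op} {self.rec(right, bindings)})"


class ToyArrayMapper:
    """Walks array nodes (stand-in for the code generation mapper)."""

    def __init__(self):
        self.expr_mapper = ToyExprMapper(self)

    def rec(self, array):
        if isinstance(array, ToyPlaceholder):
            return f"{array.name}[i]"
        # An index lambda is implemented by generating its scalar body.
        return self.expr_mapper.rec(array.body, array.bindings)


x = ToyPlaceholder("x")
doubled = ToyIndexLambda(("+", "in", "in"), {"in": x})
squared = ToyIndexLambda(("*", "d", "d"), {"d": doubled})
generated = ToyArrayMapper().rec(squared)
print(generated)
```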

What's currently supported:

- generation of Placeholders
- generation of IndexLambdas (as expressions, not arrays yet)
- ~~generation of instructions to copy expressions to outputs~~

What's not supported yet:

- any sort of preprocessing of the graph
- any sort of respect for tags
- any sort of handling of symbolic shapes

Potentially controversial things 🔪

- How to represent expression context (see LoopyExpressionContext). Also, what sort of context is needed for generating loopy expressions.
- ~~I added a node type for named output arguments (Output).~~
- Graph transformations (see also #4). The module pytato.transform adds a copy transformation which I needed to support Output. I imagine this transformation will serve as a template for others, so we should decide on how to express these.
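As one possible shape for such a template, the sketch below shows a copy mapper that rebuilds every node and a second transformation that overrides a single method. It is a toy under stated assumptions: `ToyLeaf`, `ToyCall`, `ToyCopyMapper`, and `ToyRenameMapper` are hypothetical and are not the classes in pytato.transform.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ToyLeaf:
    name: str


@dataclass(frozen=True)
class ToyCall:
    func: str
    args: Tuple[object, ...]


class ToyCopyMapper:
    """Rebuilds the graph node by node; subclasses override map_* methods."""

    def __init__(self) -> None:
        self._cache: Dict[object, object] = {}

    def rec(self, node):
        # Memoize so that shared subgraphs stay shared in the copy.
        if node not in self._cache:
            method = getattr(self, "map_" + type(node).__name__.lower())
            self._cache[node] = method(node)
        return self._cache[node]

    def map_toyleaf(self, node):
        return ToyLeaf(node.name)

    def map_toycall(self, node):
        return ToyCall(node.func, tuple(self.rec(arg) for arg in node.args))


class ToyRenameMapper(ToyCopyMapper):
    """A transformation expressed as a copy with one overridden case."""

    def __init__(self, renames: Dict[str, str]) -> None:
        super().__init__()
        self.renames = renames

    def map_toyleaf(self, node):
        return ToyLeaf(self.renames.get(node.name, node.name))


graph = ToyCall("add", (ToyLeaf("x"), ToyLeaf("x")))
copied = ToyCopyMapper().rec(graph)
renamed = ToyRenameMapper({"x": "y"}).rec(graph)
print(copied)
print(renamed)
```

Writing other transformations as subclasses of the copy mapper keeps each one limited to the node types it actually changes.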

Other notable changes:

- Binary operators in IndexLambda. This requires a policy on shape equality (see #3).
- Made Namespace inherit from Mapping.
- Changed imports to follow PEP 8 order (I think), i.e., standard library imports, third-party imports, then local imports.
- ~~Type stubs for pytools. The stub for memoize_method is necessary, otherwise Mypy complains. The other stubs are nice to have.~~ (Type stubs are now in pytools.)
- Implemented hashing and equality for Array. ~~This code is somewhat repetitive; it would be nice if it were not.~~
- ~~shape and dtype are now attributes stored in Array, to avoid repetitive code.~~
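For readers unfamiliar with the two patterns mentioned above, here is a small sketch of a namespace built on `collections.abc.Mapping` and an array class with structural equality and hashing. `ToyNamespace`, `ToyArray`, the `assign` method, and the `_key` helper are hypothetical names chosen for this example, not the code in this merge request.

```python
from collections.abc import Mapping
from typing import Dict, Iterator, Tuple


class ToyArray:
    """Hypothetical array node with structural equality and hashing."""

    def __init__(self, name: str, shape: Tuple[int, ...], dtype: str) -> None:
        self.name = name
        self.shape = shape
        self.dtype = dtype

    def _key(self) -> tuple:
        # One tuple drives both __eq__ and __hash__, so they cannot disagree.
        return (self.name, self.shape, self.dtype)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, ToyArray):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        return hash(self._key())


class ToyNamespace(Mapping):
    """Read-only mapping from names to arrays; new names go through assign()."""

    def __init__(self) -> None:
        self._symbols: Dict[str, ToyArray] = {}

    def assign(self, name: str, value: ToyArray) -> None:
        if name in self._symbols:
            raise ValueError(f"'{name}' is already assigned")
        self._symbols[name] = value

    # Mapping derives keys(), items(), get(), `in`, etc. from these three.
    def __getitem__(self, name: str) -> ToyArray:
        return self._symbols[name]

    def __iter__(self) -> Iterator[str]:
        return iter(self._symbols)

    def __len__(self) -> int:
        return len(self._symbols)


ns = ToyNamespace()
ns.assign("x", ToyArray("x", (3,), "float64"))
print("x" in ns, list(ns.keys()))
print(ns["x"] == ToyArray("x", (3,), "float64"))
```

Inheriting from `Mapping` gives the dictionary-style read interface for free while leaving item assignment undefined, so entries can only be added through the explicit method.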

Closes #7 (closed)

Edited Jun 24, 2020 by Matt Wala
