diff --git a/doc/guide.rst b/doc/guide.rst
deleted file mode 100644
index 026f96c9777cd5807dd950188be189e094e7ad5a..0000000000000000000000000000000000000000
--- a/doc/guide.rst
+++ /dev/null
@@ -1,10 +0,0 @@
-.. _guide:
-
-What can loopy do?
-==================
-
-This will become an example-based guide to what loopy can do.
-
-Loopy's Representation of a Kernel
-----------------------------------
-
diff --git a/doc/index.rst b/doc/index.rst
index 4443e82c7281b3742368b6678ab38a6487dd8218..da936cb96441787a2b6bf5bbf77110158aef2d78 100644
--- a/doc/index.rst
+++ b/doc/index.rst
@@ -27,7 +27,7 @@ When you run this script, the following kernel is generated, compiled, and execu
 .. toctree::
     :maxdepth: 2
 
-    guide
+    tutorial
     reference
     misc
 
diff --git a/doc/reference.rst b/doc/reference.rst
index f2dad9056d2085213af6458c8eb2aac9b9fa78ca..f84e7865bf782b5e7a2c94085a5e37d4fceb1c6a 100644
--- a/doc/reference.rst
+++ b/doc/reference.rst
@@ -1,3 +1,5 @@
+.. _reference:
+
 Reference Guide
 ===============
 
@@ -6,7 +8,7 @@ Reference Guide
 
 This guide defines all functionality exposed by loopy. If you would like
 a more gentle introduction, you may consider reading the example-based
-guide :ref:`guide` instead.
+:ref:`tutorial` instead.
 
 .. _inames:
 
@@ -14,7 +16,7 @@ Inames
 ------
 
 Loops are (by default) entered exactly once. This is necessary to preserve
-depdency semantics--otherwise e.g. a fetch could happen inside one loop nest,
+dependency semantics--otherwise e.g. a fetch could happen inside one loop nest,
 and then the instruction using that fetch could be inside a wholly different
 loop nest.
 
@@ -190,19 +192,34 @@ These are usually key-value pairs. The following attributes are recognized:
   to be in addition to the ones found by the heuristic described above.
 
 * ``dep=id1:id2`` creates a dependency of this instruction on the
-  instructions with identifiers ``id1`` and ``id2``.
This requires that the
-  code generated for this instruction appears textually after both of these
-  instructions' generated code.
+  instructions with identifiers ``id1`` and ``id2``. The meaning of this
+  dependency is that the code generated for this instruction is required to
+  appear textually after all of these dependees' generated code.
 
   Identifiers here are allowed to be wildcards as defined by
-  the Python module :mod:`fnmatchcase`.
+  the Python function :func:`fnmatch.fnmatchcase`. This is helpful in
+  conjunction with ``id_prefix``.
 
   .. note::
 
-      If this is not specified, :mod:`loopy` will automatically add
-      depdencies of reading instructions on writing instructions *if and
-      only if* there is exactly one writing instruction for the written
-      variable (temporary or argument).
+      Since specifying all possible dependencies is cumbersome and
+      error-prone, :mod:`loopy` employs a heuristic to automatically find
+      dependencies. Specifically, :mod:`loopy` will automatically add
+      a dependency to an instruction reading a variable if there is
+      exactly one instruction writing that variable. ("Variable" here may
+      mean either temporary variable or kernel argument.)
+
+      If each variable in a kernel is only written once, then this
+      heuristic should be able to compute all required dependencies.
+
+      Conversely, if a variable is written by two different instructions,
+      all ordering around that variable needs to be specified explicitly.
+      It is recommended to use :func:`get_dot_dependency_graph` to
+      visualize the dependency graph of possible orderings.
+
+      You may use a leading asterisk ("``*``") to turn off the single-writer
+      heuristic and indicate that the specified list of dependencies is
+      exhaustive.
 
 * ``priority=integer`` sets the instructions priority to the value
   ``integer``. Instructions with higher priority will be scheduled sooner,
@@ -381,4 +398,4 @@ Flags
 
 .. autoclass:: LoopyFlags
 
-.. vim: tw=75
+..
vim: tw=75:spell
diff --git a/doc/tutorial.rst b/doc/tutorial.rst
new file mode 100644
index 0000000000000000000000000000000000000000..8c447ef876e67721a9247defc58e9588cdec144c
--- /dev/null
+++ b/doc/tutorial.rst
@@ -0,0 +1,22 @@
+.. _tutorial:
+
+Tutorial
+========
+
+This guide provides a gentle introduction to what loopy is, how it works, and
+what it can do. In doing so, some information is omitted or glossed over for
+the sake of clarity. There's also the :ref:`reference` that clearly defines all
+aspects of loopy.
+
+Loopy's View of a Kernel
+------------------------
+
+.. literalinclude:: ../examples/rank-one.py
+    :start-after: SETUPBEGIN
+    :end-before: SETUPEND
+
+This example is included in the :mod:`loopy` distribution as
+:download:`examples/rank-one.py <../examples/rank-one.py>`.
+
+
+.. vim: tw=75
diff --git a/examples/rank-one.py b/examples/rank-one.py
new file mode 100644
index 0000000000000000000000000000000000000000..c362b64cba17d0154ef881401938612670ae1932
--- /dev/null
+++ b/examples/rank-one.py
@@ -0,0 +1,71 @@
+# SETUPBEGIN
+import numpy as np
+import pyopencl as cl
+import loopy as lp
+
+ctx = cl.create_some_context()
+queue = cl.CommandQueue(ctx)
+
+knl = lp.make_kernel(queue.device,
+    "{[i,j]: 0<=i,j