Forked from Andreas Klöckner / loopy
3576 commits behind the upstream repository.

Loopy lets you easily generate the tedious, complicated code that is necessary to get good performance out of GPUs and multi-core CPUs.



Loopy's core idea is that a computation should be described simply and then transformed into a version that gets high performance. This transformation takes place under user control, from within Python.

It can capture the following types of optimizations:

  • Vector and multi-core parallelism in the OpenCL/CUDA model
  • Data layout transformations (structure of arrays to array of structures)
  • Loop unrolling
  • Loop tiling with efficient handling of boundary cases
  • Prefetching/copy optimizations
  • Instruction level parallelism
  • and many more
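
To make one of these transformations concrete, here is a minimal sketch of loop tiling with explicit handling of the boundary case. It is written in plain Python and does not use Loopy's API; the function names `scale_simple` and `scale_tiled` and the `tile` parameter are hypothetical and chosen only for illustration. The first function is the "simple description" of the computation, the second is the kind of restructured loop nest that a tiling transformation would produce.

```python
def scale_simple(a, factor):
    """Simple description: one flat loop over all entries."""
    out = [0.0] * len(a)
    for i in range(len(a)):
        out[i] = factor * a[i]
    return out


def scale_tiled(a, factor, tile=4):
    """Same computation, with the loop split into tiles of size `tile`."""
    n = len(a)
    out = [0.0] * n
    full_tiles = n // tile

    # Full tiles: the inner loop has a fixed trip count and needs no
    # bounds check, which is what makes it easy to unroll or vectorize.
    for i_outer in range(full_tiles):
        base = i_outer * tile
        for i_inner in range(tile):
            out[base + i_inner] = factor * a[base + i_inner]

    # Boundary case: the leftover entries when n is not a multiple of tile.
    for i in range(full_tiles * tile, n):
        out[i] = factor * a[i]

    return out


# The tiled version must agree with the simple one, including when the
# length (10) is not a multiple of the tile size (4).
data = [float(i) for i in range(10)]
assert scale_tiled(data, 2.0) == scale_simple(data, 2.0)
```

Writing the second version by hand for every loop in a kernel is the tedious part; the sketch only shows the shape of the result, not how Loopy produces it.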

Loopy targets array-type computations, such as the following:

  • dense linear algebra,
  • convolutions,
  • n-body interactions,
  • PDE solvers, such as finite element, finite difference, and Fast-Multipole-type computations

It is not (and does not want to be) a general-purpose programming language.

Loopy is licensed under the liberal MIT license and is free for commercial, academic, and private use. All of Loopy's dependencies can be installed automatically from the package index by running:

pip install loo.py

In addition, Loopy is compatible with and enhances pyopencl.