C execution
This adds preliminary support for CTarget execution to loopy. I wanted to put this up for some early feedback, to see if we want any major changes before I go further.
Todo:
- The default (C) implementation of the ExecutionWrapperGeneratorBase in execution.py should be moved into a C-specific implementation in c_execution.py
- Add more tests. I'm not sure exactly what would be appropriate to add here: a comprehensive approach would be "everything but vectorized kernels", but that seems like overkill. At the very least, I should add some ILP/UNR-enabled tests.
- Caching. I don't understand it :P
- Figure out what the heck is going on with the python_dtype_str in the ExecutionWrapperGeneratorBase. For some reason, the np.float32 dtype wasn't showing as "builtin" (I haven't tested this since yesterday)
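
For the last item, a quick way to narrow things down might be to check NumPy's own `isbuiltin` attribute in isolation. The sketch below is only a guess at where the confusion could come from: it assumes `python_dtype_str` branches on `dtype.isbuiltin`, and `is_builtin_dtype` is a hypothetical helper, not something in loopy. Note that `isbuiltin` lives on `np.dtype` instances, not on scalar types like `np.float32` themselves, so the input has to be normalized first.

```python
import numpy as np


def is_builtin_dtype(dtype_like):
    """Hypothetical helper (not part of loopy): report whether NumPy
    considers the given dtype one of its built-in types.

    Assumption: python_dtype_str makes its decision based on
    np.dtype.isbuiltin, which is 1 for built-in dtypes, 0 for
    structured dtypes, and 2 for user-defined dtypes.
    """
    # np.float32 is a scalar type, not a dtype instance, so normalize first.
    dt = np.dtype(dtype_like)
    return dt.isbuiltin == 1


if __name__ == "__main__":
    # Plain scalar type vs. a structured dtype built from it.
    print(is_builtin_dtype(np.float32))
    print(is_builtin_dtype([("x", np.float32)]))
```

If `np.float32` reports as built-in here but not inside the wrapper generator, the dtype being passed in is probably wrapped or converted somewhere before it reaches `python_dtype_str`.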
Edited by Nick Curtis