Optimize *Executor.__call__ (#416)
* Micro-optimize the packing controller in execution
* Don't use kwargs.pop for argument handling in PyOpenCLKernelExecutor.__call__