Use stream in mul_add if given and allocator in subset_sum (#438)

* mul_add(): use stream if given, add an optional destination array.
  subset_sum(): use allocator if given.

* Remove unused import, add whitespace to make flake8 happy

* Update pycuda/gpuarray.py

Co-authored-by: Andreas Klöckner <inform@tiker.net>

* test_subset_sum: also assert if allocator is used

* subset_sum: use given allocator value (even if None) like other functions

---------

Co-authored-by: Andreas Klöckner <inform@tiker.net>