WIP: Noncontiguous array support for elementwise operations

Keegan Owsley requested to merge elementwise_noncontig into master

Adds support for noncontiguous arrays when performing elementwise operations.

New elementwise kernels keep the old behavior (no noncontiguous support) by default. To request a kernel with noncontiguous array support, pass use_strides=True to get_elwise_kernel, and extend each call to func.prepared_async_call with the new device_shape_and_strides property of the gpuarrays involved, like so:

        func.prepared_async_call(self._grid, self._block, None,
                self.gpudata, other.gpudata, result.gpudata,
                self.mem_size,
                self.device_shape_and_strides,
                other.device_shape_and_strides,
                result.device_shape_and_strides)

This PR rewrites the operation string passed into get_elwise_kernel, using a regex to insert the stride- and shape-related index computations. There are two versions of this regex: one that uses the standard-library re module, and a more robust one that uses the third-party regex module. The re version fails when array accesses are nested. To switch between them, change recursive_match_outer_brackets=True on line 43 of elementwise.py.
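To illustrate why nesting matters, here is a minimal sketch. It does not reproduce the patterns used in this PR; it assumes accesses have the form name[index], and the names flat_accesses and outer_accesses are hypothetical. A bracket-depth scanner stands in for the regex module's recursive patterns so that the example runs on the standard library alone.

```python
import re

# Matches name[index] only when the index contains no brackets of its own.
FLAT_ACCESS = re.compile(r"(\w+)\[([^\[\]]*)\]")
# Matches the start of an access: an identifier followed by "[".
ACCESS_OPEN = re.compile(r"(\w+)\[")


def flat_accesses(expr):
    """Return (name, index) pairs found by the non-recursive pattern."""
    return FLAT_ACCESS.findall(expr)


def outer_accesses(expr):
    """Return (name, index) pairs for outermost accesses via bracket depth."""
    results = []
    pos = 0
    while True:
        m = ACCESS_OPEN.search(expr, pos)
        if m is None:
            return results
        depth = 1
        j = m.end()
        # Walk forward until the bracket opened by this access is closed.
        while j < len(expr) and depth > 0:
            if expr[j] == "[":
                depth += 1
            elif expr[j] == "]":
                depth -= 1
            j += 1
        if depth != 0:
            raise ValueError("unbalanced brackets in %r" % expr)
        results.append((m.group(1), expr[m.end():j - 1]))
        pos = j


expr = "z[i] = x[y[i]] + 1"
print(flat_accesses(expr))   # the outer access x[...] is missed
print(outer_accesses(expr))  # x is reported with its full index y[i]
```

With the flat pattern, the access to x is never seen, so a rewrite based on it would adjust the index of y but leave x untouched.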

For now, only 2D arrays are supported. The number of dimensions can be increased trivially by changing the max_dims parameter in get_elwise_module_noncontig and the __max_dims__ class attribute of gpuarray. In the future this limit should probably be determined automatically, ideally in a way that doesn't require compiling a new kernel for each combination of array shapes.
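The index arithmetic behind a fixed dimension limit can be sketched in plain Python. This is an illustration under stated assumptions, not the kernel code from this PR: strides are counted in elements rather than bytes, the logical order is C order, and the helper names pad_dims and strided_offset are hypothetical. Padding with shape 1 and stride 0 is one possible way to let a kernel compiled for max_dims dimensions serve arrays with fewer dimensions.

```python
def pad_dims(shape, strides, max_dims):
    """Left-pad shape with 1s and strides with 0s up to max_dims entries."""
    if len(shape) != len(strides):
        raise ValueError("shape and strides must have the same length")
    if len(shape) > max_dims:
        raise ValueError("array has more than max_dims dimensions")
    extra = max_dims - len(shape)
    return (1,) * extra + tuple(shape), (0,) * extra + tuple(strides)


def strided_offset(flat_index, shape, strides):
    """Map a C-order flat element index to an element offset in storage."""
    offset = 0
    # Peel off the fastest-varying (last) dimension first.
    for dim, stride in zip(reversed(shape), reversed(strides)):
        flat_index, coord = divmod(flat_index, dim)
        offset += coord * stride
    return offset


# Transposed view of a contiguous 2x3 array: shape (3, 2), strides (1, 3).
shape, strides = (3, 2), (1, 3)
offsets = [strided_offset(i, shape, strides) for i in range(6)]

# The same view padded to four dimensions yields identical offsets.
shape4, strides4 = pad_dims(shape, strides, 4)
offsets4 = [strided_offset(i, shape4, strides4) for i in range(6)]
```

Because the padded dimensions contribute a coordinate of 0 and a stride of 0, they leave the offset unchanged, at the cost of a few extra integer divisions per element.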
