.. include:: subst.rst
OpenCL Runtime: Memory
======================
.. currentmodule:: pyopencl
.. class:: MemoryObject
.. attribute:: info
Lower case versions of the :class:`mem_info` constants
may be used as attributes on instances of this class
to directly query info attributes.
.. attribute:: hostbuf
.. method:: get_info(param)
See :class:`mem_info` for values of *param*.
.. method:: release()
.. method:: get_host_array(shape, dtype, order="C")
Return the memory object's associated host memory
area as a :class:`numpy.ndarray` of the given *shape*,
*dtype* and *order*.
.. automethod:: from_int_ptr
.. autoattribute:: int_ptr
|comparable|

Memory Migration
----------------

.. function:: enqueue_migrate_mem_objects(queue, mem_objects, flags=0, wait_for=None)
:param flags: from :class:`mem_migration_flags`
.. versionadded:: 2011.2
Only available with CL 1.2.
.. class:: Buffer(context, flags, size=0, hostbuf=None)
Create a :class:`Buffer`.
See :class:`mem_flags` for values of *flags*.
If *hostbuf* is specified and *size* is passed as zero, *size* defaults
to the size of *hostbuf*.
:class:`Buffer` inherits from :class:`MemoryObject`.
.. note::
Python also defines a type of `buffer object
<https://docs.python.org/3/c-api/buffer.html>`__,
and PyOpenCL interacts with those, too, as the host-side
target of :func:`enqueue_copy`. Make sure to always be
clear on whether a :class:`Buffer` or a Python buffer
object is needed.
Note that actual memory allocation in OpenCL may be deferred.
Buffers are attached to a :class:`Context` and are only
moved to a device once the buffer is used on that device.
That is also the point when out-of-memory errors will occur.
If you'd like to be sure that there's enough memory for
your allocation, either use :func:`enqueue_migrate_mem_objects`
(if available) or simply perform a small transfer to the
buffer. See also :class:`pyopencl.tools.ImmediateAllocator`.
.. method:: get_sub_region(origin, size, flags=0)
Only available in OpenCL 1.1 and newer.
.. method:: __getitem__(slc)
*slc* is a :class:`slice` object indicating from which byte index range
a sub-buffer is to be created. The *flags* argument of
:meth:`get_sub_region` is set to the same flags with which *self* was
created.
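
Because *slc* counts bytes rather than elements, an element range has to be
scaled by the item size before it is used to index a :class:`Buffer`. The
following is a minimal sketch, not part of PyOpenCL: ``element_byte_slice``
and ``sub_buffer_demo`` are hypothetical helper names, and the demo assumes
that a working OpenCL 1.1+ platform is available. Origin 0 is used because a
nonzero origin must additionally satisfy the device's base address alignment.

```python
import numpy as np


def element_byte_slice(start, stop, dtype):
    """Translate an element range into the byte range Buffer.__getitem__ expects."""
    itemsize = np.dtype(dtype).itemsize
    return slice(start * itemsize, stop * itemsize)


def sub_buffer_demo():
    """Copy the first 16 floats of a buffer back through a sub-buffer."""
    # Imported here so the helper above stays usable without an OpenCL runtime.
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    host = np.arange(64, dtype=np.float32)
    mf = cl.mem_flags
    buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=host)

    # The sub-buffer inherits the flags that *buf* was created with.
    sub = buf[element_byte_slice(0, 16, host.dtype)]

    out = np.empty(16, dtype=host.dtype)
    cl.enqueue_copy(queue, out, sub)  # blocking by default
    return out


if __name__ == "__main__":
    print(sub_buffer_demo())
```
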
.. function:: enqueue_fill_buffer(queue, mem, pattern, offset, size, wait_for=None)
:arg mem: the on-device :class:`Buffer`
:arg pattern: a buffer object (likely a :class:`numpy.ndarray`, e.g.
``np.uint32(0)``). The memory associated with *pattern* can be reused or
freed once the function completes.
:arg size: The size in bytes of the region to be filled. Must be a multiple of the
size of the pattern.
:arg offset: The location in bytes of the region being filled in *mem*.
Must be a multiple of the size of the pattern.
Fills a buffer with the provided pattern.
|std-enqueue-blurb|
Only available with CL 1.2.
.. versionadded:: 2011.2
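
The multiple-of-pattern-size requirement on *offset* and *size* can be checked
on the host before enqueuing the fill. The sketch below is illustrative only:
``check_fill_args`` and ``fill_demo`` are hypothetical names that are not part
of PyOpenCL, and the demo assumes an OpenCL 1.2+ platform and an in-order
command queue.

```python
import numpy as np


def check_fill_args(pattern, offset, size):
    """Return the pattern size in bytes, or raise if offset/size do not fit it."""
    pattern_nbytes = np.asarray(pattern).nbytes
    if offset % pattern_nbytes or size % pattern_nbytes:
        raise ValueError(
            "offset and size must be multiples of the pattern size "
            f"({pattern_nbytes} bytes)")
    return pattern_nbytes


def fill_demo(n=32):
    """Fill a buffer of *n* 32-bit integers with the value 7 and read it back."""
    # Imported here so the check above stays usable without an OpenCL runtime.
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)

    pattern = np.uint32(7)
    nbytes = n * np.asarray(pattern).nbytes
    buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=nbytes)

    check_fill_args(pattern, 0, nbytes)
    cl.enqueue_fill_buffer(queue, buf, pattern, 0, nbytes)

    out = np.empty(n, dtype=np.uint32)
    cl.enqueue_copy(queue, out, buf)  # runs after the fill on an in-order queue
    return out


if __name__ == "__main__":
    print(fill_demo())
```
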
.. _svm:
Shared Virtual Memory (SVM)
---------------------------
Shared virtual memory allows the host and the compute device to share
address space, so that pointers on the host and on the device may have
the same meaning. In addition, it allows the same memory to be accessed
by both the host and the device. *Coarse-grain* SVM requires that
buffers be mapped before being accessed on the host; *fine-grain* SVM
does away with that requirement.
.. warning::
Compared to :class:`Buffer`\ s, SVM brings with it a new concern: the
synchronization of memory deallocation. Unlike other objects in OpenCL,
SVM is represented by a plain (C-language) pointer and thus cannot be
reference-counted.
As a result, it is perfectly legal to allocate a :class:`Buffer`, enqueue an
operation on it, and release the buffer, without worrying about whether the
operation has completed. The OpenCL implementation will keep the buffer alive
until the operation has completed. This is *not* the case with SVM: Unless
otherwise specified, memory deallocation is performed immediately when
requested, and so SVM will be deallocated whenever the Python
garbage collector sees fit, even if the operation has not completed,
immediately leading to undefined behavior (i.e., typically, memory corruption and,
before too long, a crash).
Version 2022.2 of PyOpenCL offers substantially improved tools
for dealing with this. In particular, all means for allocating SVM
allow specifying a :class:`CommandQueue`, so that deallocation
is enqueued and performed after previously-enqueued operations
have completed.
.. _opaque-svm:
Opaque and "Wrapped-:mod:`numpy`" Styles of Referencing SVM
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When trying to pass SVM pointers to functionality in :mod:`pyopencl`,
two styles are supported:
- First, the opaque style. This style most closely resembles
:class:`Buffer`-based allocation available in OpenCL 1.x.
SVM pointers are held in opaque "handle" objects such as :class:`SVMAllocation`.
- Second, the wrapped-:mod:`numpy` style. In this case, a :class:`numpy.ndarray`
(or another object implementing the :c:func:`Python buffer protocol
<PyObject_GetBuffer>`) serves as the reference to an area of SVM.
This style permits using memory areas with :mod:`pyopencl`'s SVM
interfaces even if they were allocated outside of :mod:`pyopencl`.
Since passing a :class:`numpy.ndarray` (or another type of object obeying the
buffer interface) already has existing semantics in most settings in
:mod:`pyopencl` (such as when passing arguments to a kernel or calling
:func:`enqueue_copy`), there exists a wrapper object, :class:`SVM`, that may
be "wrapped around" these objects to mark them as SVM.
The commonality between the two styles is that both ultimately implement
the :class:`SVMPointer` interface, which :mod:`pyopencl` uses to obtain
the actual SVM pointer.
Note that it is easy to obtain a :class:`numpy.ndarray` view of SVM
areas held in the opaque style (see :attr:`SVMPointer.buf`), permitting
transitions from opaque to wrapped-:mod:`numpy` style. The opposite transition
(from wrapped-:mod:`numpy` to opaque) is not necessarily straightforward,
as it would require "fishing" the opaque SVM handle out of a chain of
:attr:`numpy.ndarray.base` attributes (or similar, depending on
the actual object serving as the main SVM reference).
See :ref:`numpy-svm-helpers` for helper functions that ease setting up the
wrapped-:mod:`numpy` structure.
Wrapped-:mod:`numpy` SVM tends to be a good fit for fine-grain SVM because of
the ease of direct host-side access, but the creation of the nested structure
that makes this possible is associated with a certain amount of cost.
By comparison, opaque SVM access tends to be a good fit for coarse-grain
SVM, because direct host access is not possible without mapping the array
anyway, and it has lower setup cost. It is of course entirely possible to use
opaque SVM access with fine-grain SVM.
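
The two styles can be contrasted in a short sketch. It is hedged in several
ways: ``svm_nbytes`` and ``svm_styles_demo`` are hypothetical names, the demo
assumes PyOpenCL 2022.2 or newer (for the *queue* arguments) and a device that
supports coarse-grain SVM, and an alignment of 0 is assumed to request the
implementation's default alignment.

```python
import numpy as np


def svm_nbytes(shape, dtype):
    """Byte count needed to hold an array of *shape* and *dtype*."""
    return int(np.prod(shape, dtype=np.int64)) * np.dtype(dtype).itemsize


def svm_styles_demo():
    """Allocate SVM once in the opaque style and once in the wrapped-numpy style."""
    # Imported here so the helper above stays usable without an OpenCL runtime.
    import pyopencl as cl

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    shape, dtype = (1024,), np.float32

    # Opaque style: a handle object; the size is given in bytes.
    # Passing *queue* lets deallocation be enqueued after pending operations.
    opaque = cl.SVMAllocation(
        ctx, svm_nbytes(shape, dtype), 0, cl.svm_mem_flags.READ_WRITE,
        queue=queue)

    # Wrapped-numpy style: a numpy array backed by SVM, marked as SVM
    # by wrapping it in cl.SVM.
    wrapped = cl.SVM(cl.csvm_empty(ctx, shape, dtype, queue=queue))

    # Coarse-grain SVM has to be mapped before the host touches it.
    with wrapped.map_rw(queue) as host_view:
        host_view.fill(0)

    return opaque, wrapped


if __name__ == "__main__":
    svm_styles_demo()
```
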
.. versionchanged:: 2022.2
This version adds the opaque style of SVM access.
Using SVM with Arrays
^^^^^^^^^^^^^^^^^^^^^
While all types of SVM can be used as the memory backing
:class:`pyopencl.array.Array` objects, ensuring that new arrays returned
by array operations (e.g. arithmetic) also use SVM is easiest to accomplish
by passing an :class:`~pyopencl.tools.SVMAllocator` (or
:class:`~pyopencl.tools.SVMPool`) as the *allocator* parameter in functions
returning new arrays.
SVM Pointers, Allocations, and Maps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autoclass:: SVMPointer
.. autoclass:: SVMAllocation
.. autoclass:: SVM
.. autoclass:: SVMMap
.. _numpy-svm-helpers:
Helper functions for :mod:`numpy`-based SVM allocation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. autofunction:: svm_empty
.. autofunction:: svm_empty_like
.. autofunction:: csvm_empty
.. autofunction:: csvm_empty_like
.. autofunction:: fsvm_empty_like
Operations on SVM
^^^^^^^^^^^^^^^^^
(See also :ref:`mem-transfer`.)
.. autofunction:: enqueue_svm_memfill
.. autofunction:: enqueue_svm_migratemem
.. class:: ImageFormat([channel_order, channel_type])
.. attribute:: channel_order
See :class:`channel_order` for possible values.
.. attribute:: channel_data_type
See :class:`channel_type` for possible values.
.. attribute:: channel_count
.. versionadded:: 0.91.5
.. attribute:: dtype_size
.. versionadded:: 0.91.5
.. attribute:: itemsize
.. versionadded:: 0.91.5
.. method:: __repr__
Returns a :class:`str` representation of the image format.
.. versionadded:: 0.91
|comparable|
.. versionchanged:: 0.91
Constructor arguments added.
.. versionchanged:: 2013.2
:class:`ImageFormat` was made comparable and hashable.
.. function:: get_supported_image_formats(context, flags, image_type)
See :class:`mem_flags` for possible values of *flags*
and :class:`mem_object_type` for possible values of *image_type*.
.. class:: Image(context, flags, format, shape=None, pitches=None, hostbuf=None, is_array=False, buffer=None)
See :class:`mem_flags` for values of *flags*.
*shape* is a 2- or 3-tuple. *format* is an instance of :class:`ImageFormat`.
*pitches* is a 1-tuple for 2D images and a 2-tuple for 3D images, indicating
the distance in bytes from one scan line to the next, and from one 2D image
slice to the next.
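
For a tightly packed host buffer, the pitches follow directly from the image
shape and the pixel size. The sketch below shows that arithmetic.
``tight_pitches`` is a hypothetical helper, not a PyOpenCL function, and
*bytes_per_pixel* is assumed to be the number of bytes that one pixel occupies
in the chosen image format.

```python
def tight_pitches(shape, bytes_per_pixel):
    """Return *pitches* for a tightly packed image.

    *shape* is (width, height) or (width, height, depth).
    """
    if len(shape) not in (2, 3):
        raise ValueError("shape must be a 2- or 3-tuple")

    width, height = shape[0], shape[1]
    row_pitch = width * bytes_per_pixel  # bytes from one scan line to the next

    if len(shape) == 2:
        return (row_pitch,)

    slice_pitch = row_pitch * height  # bytes from one 2D slice to the next
    return (row_pitch, slice_pitch)


if __name__ == "__main__":
    print(tight_pitches((640, 480), 4))
    print(tight_pitches((64, 32, 8), 16))
```
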
If *hostbuf* is given and *shape* is *None*, then *hostbuf.shape* is
used as the *shape* parameter.
:class:`Image` inherits from :class:`MemoryObject`.
.. note::
If you want to load images from :class:`numpy.ndarray` instances or read images
back into them, be aware that OpenCL images expect the *x* dimension to vary
fastest, whereas in the default (C) order of :mod:`numpy` arrays, the last index
varies fastest. If your array is arranged in the wrong order in memory,
there are two possible fixes for this:
* Convert the array to Fortran (column-major) order using :func:`numpy.asarray`.
* Pass *ary.T.copy()* to the image creation function.
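
Both fixes lead to the same memory layout, which can be verified with
:mod:`numpy` alone. This is a small illustrative sketch. The array sizes and
variable names are arbitrary, and the array is assumed to be indexed as
``ary[x, y]``.

```python
import numpy as np

nx, ny = 4, 3

# Indexed as ary[x, y]. In C order the last index (y) varies fastest in
# memory, which is the wrong way around for an OpenCL image.
ary = np.arange(nx * ny, dtype=np.float32).reshape(nx, ny)

# Fix 1: keep the [x, y] indexing, but store the data in Fortran order,
# so that the first index (x) varies fastest.
ary_f = np.asarray(ary, order="F")

# Fix 2: transpose and copy. The result is C-contiguous and indexed [y, x],
# so x is again the fastest-varying index.
ary_t = ary.T.copy()

# order="K" walks the elements in the order in which they sit in memory.
mem_f = ary_f.ravel(order="K")
mem_t = ary_t.ravel(order="K")
same_layout = bool(np.array_equal(mem_f, mem_t))

if __name__ == "__main__":
    print(same_layout)
```
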
.. versionadded:: 0.91
.. versionchanged:: 2011.2
Added *is_array* and *buffer*, which are only available on CL 1.2 and newer.
.. attribute:: info
Lower case versions of the :class:`mem_info`
and :class:`image_info` constants
may be used as attributes on instances of this class
to directly query info attributes.
.. attribute:: shape
Return the value of the *shape* constructor argument as a :class:`tuple`.
.. method:: get_image_info(param)
See :class:`image_info` for values of *param*.
.. method:: release()
|comparable|
.. function:: image_from_array(ctx, ary, num_channels=None, mode="r", norm_int=False)
Build a 2D or 3D :class:`Image` from the :class:`numpy.ndarray` *ary*. If
*num_channels* is greater than one, the last dimension of *ary* must be
identical to *num_channels*. *ary* must be in C order. If *num_channels* is
not given, it defaults to 1 for scalar types and the number of entries
for :ref:`vector-types`.
The :class:`ImageFormat` is chosen as the first *num_channels* components
of "RGBA".
:param mode: ``"r"`` (read) or ``"w"`` (write)
.. note::
When reading from the image object, the indices passed to ``read_imagef``
are in the reverse order from what they would be when accessing *ary* from
Python.
If *norm_int* is *True*, then the integer values are normalized to a floating
point scale of 0..1 when read.
.. versionadded:: 2011.2
.. function:: enqueue_fill_image(queue, mem, color, origin, region, wait_for=None)
:arg color: a buffer object (likely a :class:`numpy.ndarray`)
|std-enqueue-blurb|
Only available with CL 1.2.
.. versionadded:: 2011.2

Transfers
---------

.. autofunction:: enqueue_copy(queue, dest, src, **kwargs)
.. autofunction:: enqueue_fill(queue, dest, src, **kwargs)

Mapping Memory into Host Address Space
--------------------------------------

.. autoclass:: MemoryMap
.. function:: enqueue_map_buffer(queue, buf, flags, offset, shape, dtype, order="C", strides=None, wait_for=None, is_blocking=True)
|explain-waitfor|
*shape*, *dtype*, and *order* have the same meaning
as in :func:`numpy.empty`.
See :class:`map_flags` for possible values of *flags*.
*strides*, if given, overrides *order*.
:return: a tuple *(array, event)*. *array* is a
:class:`numpy.ndarray` representing the host side
of the map. Its *.base* member contains a
:class:`MemoryMap`.
.. versionchanged:: 2011.1
*is_blocking* now defaults to True.
.. versionchanged:: 2013.1
*order* now defaults to "C".
.. versionchanged:: 2013.2
Added *strides* argument.
Sample usage::

    # enqueue_map_buffer returns a tuple (array, event)
    mapped_buf, event = cl.enqueue_map_buffer(queue, buf, ...)
    with mapped_buf.base:
        # work with mapped_buf
        ...

    # memory will be unmapped here

.. function:: enqueue_map_image(queue, buf, flags, origin, region, shape, dtype, order="C", strides=None, wait_for=None, is_blocking=True)
|explain-waitfor|
*shape*, *dtype*, and *order* have the same meaning
as in :func:`numpy.empty`.
See :class:`map_flags` for possible values of *flags*.
*strides*, if given, overrides *order*.
:return: a tuple *(array, event)*. *array* is a
:class:`numpy.ndarray` representing the host side
of the map. Its *.base* member contains a
:class:`MemoryMap`.
.. versionchanged:: 2011.1
*is_blocking* now defaults to True.
.. versionchanged:: 2013.1
*order* now defaults to "C".
.. versionchanged:: 2013.2
Added *strides* argument.

Samplers
--------

.. method:: __init__(context, normalized_coords, addressing_mode, filter_mode)
*normalized_coords* is a :class:`bool` indicating whether
to use coordinates between 0 and 1 (*True*) or the texture's
natural pixel size (*False*).
See :class:`addressing_mode` and :class:`filter_mode` for possible
argument values.
Also supports an alternate signature ``(context, properties)``.
:arg properties: a sequence
of keys and values from :class:`sampler_properties` as accepted
by :c:func:`clCreateSamplerWithProperties` (see the OpenCL
spec for details). The trailing *0* is added automatically
and does not need to be included.
This signature requires OpenCL 2 or newer.
.. versionchanged:: 2018.2
The properties-based signature was added.
.. attribute:: info
Lower case versions of the :class:`sampler_info` constants
may be used as attributes on instances of this class
to directly query info attributes.
.. method:: get_info(param)
See :class:`sampler_info` for values of *param*.
.. automethod:: from_int_ptr
.. autoattribute:: int_ptr
|comparable|
.. class:: Pipe(context, flags, packet_size, max_packets, properties=())
See :class:`mem_flags` for values of *flags*.
:arg properties: a sequence
of keys and values from :class:`pipe_properties` as accepted
by :c:func:`clCreatePipe`. The trailing *0* is added automatically
and does not need to be included.
(This argument must currently be empty.)
This class requires OpenCL 2 or newer.
.. versionchanged:: 2021.1.7
*properties* now defaults to an empty tuple.
.. method:: get_pipe_info(param)
See :class:`pipe_info` for values of *param*.
Type aliases
------------
.. currentmodule:: pyopencl._cl
.. class:: Buffer
See :class:`pyopencl.Buffer`.