- Jun 30, 2015
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Jun 19, 2015
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
Add option to compile to nvcc's fatbin format
-
- Jun 17, 2015
-
-
Vladimir Rutsky authored
Useful when compiling CUDA kernels on a build server that has no NVIDIA GPU (but does have the NVIDIA CUDA Toolkit), with the target GPU architecture specified through the `options` argument.
-
Vladimir Rutsky authored
A fatbin embeds prebuilt cubins for a set of specified real architectures and PTX assembly for a set of specified virtual architectures. The driver loads a prebuilt cubin if one matches the current GPU; otherwise it compiles the PTX for the closest virtual GPU architecture.

This can be used to distribute CUDA-powered applications without installing the NVIDIA CUDA Toolkit (nvcc) and a development environment on the target platform.

.cu files can be precompiled to fatbins on a build server with

    fatbin = pycuda.compiler.compile(
        cu_file_text,
        options=[
            "-gencode", "arch=compute_20,code=compute_20",
            "-gencode", "arch=compute_20,code=sm_20",
            "-gencode", "arch=compute_30,code=compute_30",
            "-gencode", "arch=compute_30,code=sm_30",
        ],
        target="fatbin")

The fatbins can then be distributed to machines without nvcc and loaded using

    cuda_module = pycuda.driver.module_from_buffer(fatbin)
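The `-gencode` pairs in this message follow a regular pattern, so they can be generated instead of written by hand. The following is a minimal sketch, not part of the commit: `gencode_options` and `build_fatbin` are hypothetical helper names, and the sketch assumes every listed compute capability should get both a PTX entry and a cubin entry.

```python
def gencode_options(compute_capabilities):
    """Build nvcc -gencode flags embedding PTX and a cubin per capability.

    Hypothetical helper, not part of PyCUDA.
    """
    options = []
    for cc in compute_capabilities:
        # code=compute_XX embeds PTX for the virtual architecture.
        options += ["-gencode", f"arch=compute_{cc},code=compute_{cc}"]
        # code=sm_XX embeds a prebuilt cubin for the real architecture.
        options += ["-gencode", f"arch=compute_{cc},code=sm_{cc}"]
    return options


def build_fatbin(cu_file_text, compute_capabilities):
    """Compile CUDA source text to a fatbin (needs PyCUDA and nvcc)."""
    # Imported lazily so gencode_options() works without PyCUDA installed.
    import pycuda.compiler

    return pycuda.compiler.compile(
        cu_file_text,
        options=gencode_options(compute_capabilities),
        target="fatbin")
```

On the build server, `build_fatbin(cu_file_text, [20, 30])` would then pass the same option list as the hand-written call in the commit message.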
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Jun 16, 2015
-
-
Andreas Klöckner authored
add option to compile device-independent PTX files with pycuda.compiler.compile()
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Vladimir Rutsky authored
-
- May 30, 2015
-
-
Andreas Klöckner authored
Port examples to Python 3.
-
- May 29, 2015
-
-
Dzhelil Rufat authored
-
Dzhelil Rufat authored
-
Dzhelil Rufat authored
-
- May 27, 2015
-
-
Andreas Klöckner authored
Add async memset methods to driver, uncomment peer async copy method
-
- May 26, 2015
-
-
Alex Park authored
-
- May 23, 2015
-
-
Alex Park authored
-
- May 13, 2015
-
-
Andreas Klöckner authored
-
- May 11, 2015
-
-
Andreas Klöckner authored
-
- May 03, 2015
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Mar 04, 2015
-
-
Andreas Klöckner authored
-
- Jan 27, 2015
-
-
Andreas Klöckner authored
Add support for unknown dimension in reshape
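As an illustration of what an "unknown dimension" means here, the sketch below assumes the NumPy convention in which a single `-1` entry is inferred from the total element count. `resolve_shape` is a hypothetical name and this is not the code from the commit.

```python
from math import prod


def resolve_shape(size, shape):
    """Replace a single -1 in `shape` so the result holds `size` elements."""
    shape = tuple(shape)
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("only one dimension may be unknown")
    if not unknown:
        if prod(shape) != size:
            raise ValueError("total size of the array must be unchanged")
        return shape
    # Product of the known dimensions; the unknown one must make up the rest.
    known = prod(dim for dim in shape if dim != -1)
    if known == 0 or size % known != 0:
        raise ValueError("cannot infer the unknown dimension")
    i = unknown[0]
    return shape[:i] + (size // known,) + shape[i + 1:]
```

For a 12-element array, `resolve_shape(12, (3, -1))` yields `(3, 4)`, while `(5, -1)` is rejected because 12 is not divisible by 5.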
-
Thomas Unterthiner authored
-
- Dec 19, 2014
-
-
Andreas Klöckner authored
BUG: fix to demo_struct.py for recent CUDA 4.0+
-
- Dec 15, 2014
-
-
Gregory R. Lee authored
-
Gregory R. Lee authored
-
Gregory R. Lee authored
wrap int32 and intp arguments to memcpy_htod in numpy.getbuffer() to avoid AttributeError. fix prepare/prepared_call for CUDA v4.0+
-
- Nov 12, 2014
-
-
Andreas Klöckner authored
-
- Nov 07, 2014
-
-
Andreas Klöckner authored
Use appdirs package to create cache in XDG-correct location
-
Bruce Merry authored
This fixes #54.
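For context, an "XDG-correct" cache location is one that honors `XDG_CACHE_HOME` and falls back to `~/.cache` when the variable is unset, empty, or not an absolute path. The sketch below shows that rule using only the standard library. It is not the appdirs implementation, and `xdg_cache_dir` and the `"pycuda"` directory name are assumptions made for illustration.

```python
import os


def xdg_cache_dir(app_name, environ=None):
    """Return a per-application cache directory following the XDG rule."""
    environ = os.environ if environ is None else environ
    base = environ.get("XDG_CACHE_HOME", "")
    # Unset, empty, or relative values are ignored in favor of ~/.cache.
    if not base or not os.path.isabs(base):
        base = os.path.join(os.path.expanduser("~"), ".cache")
    return os.path.join(base, app_name)
```

Passing the environment as an argument keeps the function easy to check without modifying `os.environ`.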
-
- Oct 31, 2014
-
-
Andreas Klöckner authored
-
- Oct 16, 2014
-
-
Andreas Klöckner authored
-
- Oct 08, 2014
-
-
Andreas Klöckner authored
-
- Sep 22, 2014
-
-
Andreas Klöckner authored
-
- Sep 19, 2014
-
-
Andreas Klöckner authored
-