- Aug 09, 2017
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Jul 20, 2017
-
-
Andreas Klöckner authored
Update GPUArray documentation. See merge request !5
-
Gregory R. Lee authored
Updated the documentation and release notes to reflect this change. The default order of zeros_like, etc. is 'K' to match NumPy behavior.
-
Gregory R. Lee authored
-
- Jul 19, 2017
-
-
Gregory R. Lee authored
-
Gregory R. Lee authored
-
Andreas Klöckner authored
Gpuarray contig. See merge request !4
-
- Jul 18, 2017
-
-
Gregory R. Lee authored
-
Gregory R. Lee authored
-
Gregory R. Lee authored
The order='K' default is needed so that the .imag and .real properties of GPUArray preserve the array order when the component is zero-valued. These options have been present in the equivalent NumPy functions since version 1.6.
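The order='K' semantics referred to above can be illustrated with NumPy alone. This is a minimal sketch of the NumPy behavior that the commit says GPUArray is meant to match; it does not call PyCUDA, and the array names are made up for illustration.

```python
import numpy as np

# A small Fortran-ordered (column-major) complex array.
a_f = np.asfortranarray(np.arange(6, dtype=np.complex64).reshape(2, 3))

# order="K" asks for a result whose memory layout matches the input
# as closely as possible, so the Fortran layout is preserved.
z_k = np.zeros_like(a_f, order="K")

# order="C" forces a row-major result regardless of the input layout.
z_c = np.zeros_like(a_f, order="C")

print("input  F-contiguous:", a_f.flags.f_contiguous)
print("'K'    F-contiguous:", z_k.flags.f_contiguous)
print("'C'    C-contiguous:", z_c.flags.c_contiguous)
```

With a 'C' default, a zero-filled result built for a Fortran-ordered input would silently change layout, which is the situation the commit message describes for zero-valued .imag and .real components.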
-
Gregory R. Lee authored
add missing contiguity check and add reshape order tests
-
Gregory R. Lee authored
-
Gregory R. Lee authored
-
- Jul 14, 2017
-
-
Andreas Klöckner authored
-
- Jul 13, 2017
-
-
Andreas Klöckner authored
Fixing CUDA 9.0 build
-
Boris Fomitchev authored
-
- May 31, 2017
-
-
Andreas Klöckner authored
-
- May 24, 2017
-
-
Andreas Klöckner authored
-
- Mar 22, 2017
-
-
Andreas Klöckner authored
-
- Jan 15, 2017
-
-
Andreas Klöckner authored
Jit link. See merge request !2
-
- Jan 14, 2017
- Jan 13, 2017
-
-
Lurch authored
-
Lurch authored
Changed TestDriver.test_recursive_launch() in test/test_driver.py to use pycuda.compiler.DynamicSourceModule and removed the xfail marker. Added a test case for pycuda.compiler.JitLinkModule that compiles and links two .cu files. Removed test/test_jit_link.py; all further tests go in test/test_driver.py from now on. Removed a bad "import pycuda.autoinit" (how did this even get there?).
-
Lurch authored
Now using Context.set_limit(). This needed a missing limit enum, CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH (in CUDA8/v8.0/include/cuda.h[975], available since 3.5), so added CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH and CU_LIMIT_DEV_RUNTIME_PENDING_LAUNCH_COUNT (same) to pycuda.driver.limit.
-
Lurch authored
+ CudaModule + SourceModule + JitLinkModule + DynamicSourceModule
Split the "old" class SourceModule in two: class CudaModule and the "new" SourceModule.
  - CudaModule is now the common base class for module loading and provides common methods. All methods here were moved from the old SourceModule.
  - The "new" SourceModule's interface and system requirements are 100% unchanged; it will work under all previous configuration scenarios.
  - JitLinkModule requires at least CUDA 5.5 and Compute Capability 3.5 (that is now guarded in the constructor); it's the Swiss Army knife for non-trivial linker invocations.
  - DynamicSourceModule is a special case of JitLinkModule: it exposes the same interface as SourceModule but enables dynamic parallelism (it comes with one extra optional argument in the constructor, cudalib_dir). It's meant for the trivial cases where the user has a single source file, as before with SourceModule.
So if a PyCUDA user only wants to activate dynamic parallelism, all that's required is to replace "SourceModule" with "DynamicSourceModule", given that we're able to locate "cudadevrt" automagically in method _locate_cuda_libdir(); otherwise the caller must provide the CUDA library path manually in the constructor argument "cudalib_dir". I do not think this can be reduced any further.
Other changes in class JitLinkModule:
  - Made all add_* methods and the link() method return self
  - Moved CUDA library path detection logic into method JitLinkModule._locate_cuda_libdir(), which gets called only once from the constructor
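The "return self" change to the add_* and link() methods is what allows calls to be chained. A minimal pure-Python sketch of that pattern follows; ToyLinker and its method names are hypothetical stand-ins for illustration, not PyCUDA's JitLinkModule, and nothing here touches CUDA.

```python
class ToyLinker:
    """Hypothetical stand-in showing the chaining pattern only."""

    def __init__(self):
        self.inputs = []     # (kind, value) pairs queued for linking
        self.linked = False  # set once link() has been called

    def add_source(self, source):
        # Queue an in-memory source string, then return self for chaining.
        self.inputs.append(("source", source))
        return self

    def add_file(self, path):
        # Queue a file path, then return self for chaining.
        self.inputs.append(("file", path))
        return self

    def link(self):
        # A real linker would do its work here; the toy only sets a flag.
        self.linked = True
        return self


# Because every step returns the same object, one expression suffices.
mod = ToyLinker().add_source("// kernel A").add_file("kernel_b.cu").link()
print(mod.linked, len(mod.inputs))
```

Without the returned self, each step would need its own statement on a named variable; with it, construction, inputs, and linking read as one pipeline.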
-
- Jan 12, 2017
- Jan 11, 2017
- Jan 10, 2017
- Oct 24, 2016
-
-
Andreas Klöckner authored
let setup.py detect CUDA_ROOT and lib directory for macOS
-
Jeong YunWon authored
-
- Oct 15, 2016
-
-
Andreas Klöckner authored
-
- Oct 10, 2016
-
-
Andreas Klöckner authored
-