- Dec 16, 2012
-
-
Andreas Klöckner authored
-
- Dec 15, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
-
- Dec 13, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Dec 12, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Dec 08, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Dec 06, 2012
-
-
Andreas Klöckner authored
-
- Nov 27, 2012
-
-
Andreas Klöckner authored
-
- Nov 11, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Nov 10, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Nov 09, 2012
-
-
Andreas Klöckner authored
-
- Oct 24, 2012
-
-
Andreas Klöckner authored
Made some changes to the benchmark example... let me know what you think.
-
dhj authored
* NumPy performs element-wise operations by default, so the CPU operation now uses pure NumPy.
* Eliminated the loop, which is not needed to demonstrate parallelism on array operations.
* Made the number of workers explicit, rather than chosen by the GPU, through a local_size variable passed to the kernel execution.
* Increased the data size to ~8 million points to show the difference between CPU-based and GPU-based computation more clearly.
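
A minimal sketch of what an explicit local_size implies for the launch dimensions, not taken from the commit itself. In OpenCL 1.x, when a local work size is given explicitly, the global work size has to be a multiple of it, so the data size is usually rounded up. The helper name `padded_global_size`, the value 256 for the local size, and the use of 2**23 for "~8 million" are all assumptions made for illustration.

```python
def padded_global_size(n_items: int, local_size: int) -> int:
    """Round n_items up to the next multiple of local_size.

    Hypothetical helper: OpenCL 1.x requires the global work size to be
    divisible by an explicitly given local work size.
    """
    if n_items < 0 or local_size <= 0:
        raise ValueError("n_items must be >= 0 and local_size must be > 0")
    # Ceiling division, then scale back up to a multiple of local_size.
    return -(-n_items // local_size) * local_size


N_ITEMS = 2**23      # assumed stand-in for "~8 million data points"
LOCAL_SIZE = 256     # assumed work-group size; the commit does not state one

global_size = padded_global_size(N_ITEMS, LOCAL_SIZE)
num_groups = global_size // LOCAL_SIZE  # work-groups the device will schedule

print(global_size, num_groups)
```

If the padded size exceeds the array length, the kernel needs a bounds check so the extra work-items do not touch memory past the end of the buffers.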
-
- Oct 07, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Oct 06, 2012
-
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
Andreas Klöckner authored
-
- Oct 05, 2012
-
-
Andreas Klöckner authored
-