- Feb 22, 2013
-
-
Karl Rupp authored
-
- Feb 21, 2013
-
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
* Added guard for not using AMD GEMM kernels if device has less than 20 kB of local memory.
* Fixed a warning for GEMM kernels (fast NVIDIA version).
-
Karl Rupp authored
-
- Feb 20, 2013
-
-
Karl Rupp authored
-
Karl Rupp authored
Added fast GEMM kernels for AMD Tahiti based on input from Philippe's autotuner. Only works for square matrices with dimensions being a multiple of 256.
-
Karl Rupp authored
-
Karl Rupp authored
Fixed a warning (/* in comment) in CL/cl_gl_ext.h (staying with OpenCL 1.1; the fix is already in the OpenCL 1.2 headers)
-
Karl Rupp authored
Replaced all uses of size_t by std::size_t (exception: viennacl/generator/*, which will be replaced soon anyway)
-
Karl Rupp authored
-
- Feb 19, 2013
-
-
Karl Rupp authored
* Fixed some corner cases for BLAS-1-type operations on vectors.
* Removed bottleneck in sparse-test-XYZ (manually transposing ublas-matrix).
-
- Feb 18, 2013
-
-
Karl Rupp authored
* Fixed an access violation in copy() from STL to device for compressed_matrix with empty rows.
-
- Feb 13, 2013
- Feb 12, 2013
-
-
Karl Rupp authored
Added support for Xeon Phi (using 128x128 work items). Adjusted number of work groups on CPU to power of two (six-core CPUs otherwise lead to problems with reductions).
-
- Feb 07, 2013
- Feb 04, 2013
-
-
Karl Rupp authored
-
- Feb 03, 2013
-
-
Karl Rupp authored
Fixed bug in AMG: Removed unnecessary .resize() on empty operator matrices on GPU. Thanks to Jakub Pola for reporting.
-
- Jan 30, 2013
-
-
Karl Rupp authored
-
- Jan 16, 2013
-
-
Karl Rupp authored
Added fix for segfaults on program exit when providing custom OpenCL queues (thanks to Denis Demidov for reporting)
-
Karl Rupp authored
* Fixed issues with copy back to host when OpenCL handles are passed to CTORs of vector, matrix, or compressed_matrix (thanks to Jakub Pola for reporting)
* vector.empty() is now const (as it is supposed to be)
-
- Dec 14, 2012
-
-
Karl Rupp authored
* Added an overload for result_of::alignment for vector_expression (again thanks to Denis)
* Resolved a problem with the include order in the SPAI preconditioner when using g++ 4.7
-
- Dec 09, 2012
-
-
Karl Rupp authored
-
- Dec 02, 2012
-
-
-
Karl Rupp authored
* Added Doxygen comments to generated OpenCL kernel (source) files
* Ensured newline at end of each source file
* Specified AMD APP SDK 2.7 error: "on Linux"
* Makefile for manual now cleans up additional temporary files
-
Karl Rupp authored
* Improved preamble in tutorials and benchmarks
* Mixed CG tag now provides an inner_tolerance() for the low-precision loop
-
Karl Rupp authored
* Added Doxygen comments to all namespaces.
* Updated comments on host-based implementations to clearly state (optional) OpenMP usage.
-
- Dec 01, 2012
-
-
Karl Rupp authored
* Removed MSVC-switch in tutorials and benchmarks for reading files (require users to run from build/ across different operating systems)
* Updated old Eigen-code to version 3.x
* Fixed a few more warnings in Visual Studio, added /wd4996 flag to get rid of VC iterator advertisements
* Fixed an overly strict assert() on vector-reductions with OpenCL, including a clean initialization of reduction vector
* Changed STL overload of norm_X from enable-if to plain overloading, otherwise MSVC has problems.
-
Karl Rupp authored
* Added finish() before copy() in tests in order to resolve issues with AMD APP SDK
-
Karl Rupp authored
* Fixed all warnings obtained in Visual Studio 2005 and 2010
* Reverted SFINAE in CTOR for vector to separate overloads for vector_range and vector_slice (does not work with VS 2005)
* Moved default-implementation for predicates to forwards.h, otherwise Visual Studio does not recognize forward definitions properly
* Removed unnecessary Boost.filesystem and Boost.system components check from dist-package
* Adjusted version number in Doxyfile and CMakeLists.txt
-
- Nov 30, 2012
-
-
Karl Rupp authored
* Added least_squares and iterative to CUDA-examples
* Fixed a minor flaw in viennacl-info
-
Karl Rupp authored
* viennacl-info now prints information for all available platforms.
* User-provided OpenCL context is no longer freed at exit (inc() on handle after assignment).
* Added Philippe's input to changelogs
-