Commits · 9da8e4e6916b06c6b32dd53817df455abc924c32 · Kaushik Kulkarni / viennacl-dev

Nov 08, 2014
- Memory: Added overload of default_memory_type() to switch default memory domain. · 9da8e4e6
  Karl Rupp authored Nov 08, 2014
```
Makes it much easier to e.g. only use OpenCL even though CUDA is enabled.
Since this relies on a singleton, the mechanism is not thread-safe.
```
  9da8e4e6
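A minimal sketch of why a singleton-backed default is convenient but not thread-safe. The names `memory_domain`, `default_domain_slot()` and `default_memory_domain()` are hypothetical stand-ins, not ViennaCL's actual declarations, and the initial value is an assumption.

```cpp
// Hypothetical stand-in for the available memory domains.
enum class memory_domain { main_memory, opencl_memory, cuda_memory };

// Returns the single process-wide slot that stores the default domain.
inline memory_domain & default_domain_slot()
{
  static memory_domain slot = memory_domain::cuda_memory; // assumed initial value
  return slot;
}

// Getter: reads the current default.
inline memory_domain default_memory_domain()
{
  return default_domain_slot();
}

// Setter overload: replaces the default and returns the previous value.
// No locking is performed, so concurrent calls from several threads race.
inline memory_domain default_memory_domain(memory_domain new_domain)
{
  memory_domain old_domain = default_domain_slot();
  default_domain_slot() = new_domain;
  return old_domain;
}
```

Calling the setter once at program start, before any worker threads exist, avoids the race in this sketch.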
- CUDA: Fixed incomplete refactoring from Doxygen-fixes two commits earlier. · 2ea16808
  Karl Rupp authored Nov 08, 2014
  
  2ea16808
- Doxygen: Removed old LaTeX files, added logos to start page. · a82e40a3
  Karl Rupp authored Nov 08, 2014
```
Migration to Doxygen 1.8.x complete. Resolves #18.
```
  a82e40a3
- Doxygen: Cleanup and fixed warnings. · 43ba114c
  Karl Rupp authored Nov 08, 2014
  
  43ba114c
- Doxygen: Updated hardware table. · 6927bd05
  Karl Rupp authored Nov 08, 2014
  
  6927bd05
- Doxygen: Changed label prefix for manual from "manual-page-" to "manual-" · b6b92d08
  Karl Rupp authored Nov 08, 2014
```
Results in shorter URLs.
```
  b6b92d08
- README, CMake: Updates for release 1.6.0 · 1dcc8aa7
  Karl Rupp authored Nov 08, 2014
  
  1dcc8aa7
- GMRES: Fixed problems with pipelined implementation for coordinate_matrix. · b7cce55e
  Karl Rupp authored Nov 08, 2014
```
A temporary buffer wasn't flushed and still contained old values.
```
  b7cce55e
- GMRES: Proper support for early convergence in pipelined implementation. · 490566d4
  Karl Rupp authored Nov 08, 2014
```
Now deals correctly with very small systems for which the maximum iteration
count is larger than the system size.
```
  490566d4
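One way to handle the small-system case is to cap the number of Krylov vectors at the system size. The sketch below assumes that approach; `effective_krylov_dim` is a hypothetical helper and not taken from the commit.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: a Krylov basis cannot hold more linearly independent
// vectors than the system has unknowns, so the iteration count is capped.
inline std::size_t effective_krylov_dim(std::size_t system_size,
                                        std::size_t max_iterations)
{
  return std::min(system_size, max_iterations);
}
```

For a 3x3 system with a maximum of 20 iterations, this sketch allows at most 3 basis vectors.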
- Scheduler: Fixed incorrect handling of A += trans(B) and A -= trans(B) · faeb1071
  Karl Rupp authored Nov 08, 2014
```
Uses temporaries C = trans(B) in order to avoid troubles with corner
cases like A += trans(A).
```
  faeb1071
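The aliasing problem behind A += trans(A) can be reproduced with a small standalone sketch. `dense_matrix`, `add_trans_naive` and `add_trans_with_temporary` are hypothetical names for illustration and do not reflect the scheduler's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal square, row-major matrix used only for this illustration.
struct dense_matrix
{
  std::size_t n;
  std::vector<double> data;

  explicit dense_matrix(std::size_t n_) : n(n_), data(n_ * n_, 0.0) {}

  double & operator()(std::size_t i, std::size_t j)       { return data[i * n + j]; }
  double   operator()(std::size_t i, std::size_t j) const { return data[i * n + j]; }
};

// Reads B(j,i) while writing A(i,j). If A and B are the same object,
// entries below the diagonal read values that were already updated.
inline void add_trans_naive(dense_matrix & A, dense_matrix const & B)
{
  assert(A.n == B.n);
  for (std::size_t i = 0; i < A.n; ++i)
    for (std::size_t j = 0; j < A.n; ++j)
      A(i, j) += B(j, i);
}

// Materializes C = trans(B) first, then computes A += C.
// Safe even when A and B alias.
inline void add_trans_with_temporary(dense_matrix & A, dense_matrix const & B)
{
  assert(A.n == B.n);
  dense_matrix C(B.n);
  for (std::size_t i = 0; i < B.n; ++i)
    for (std::size_t j = 0; j < B.n; ++j)
      C(i, j) = B(j, i);
  for (std::size_t i = 0; i < A.n; ++i)
    for (std::size_t j = 0; j < A.n; ++j)
      A(i, j) += C(i, j);
}
```

With A = [[1, 2], [3, 4]], the naive version called as `add_trans_naive(A, A)` yields 8 at position (1,0) instead of 5, while the version with a temporary produces the symmetric result [[2, 5], [5, 8]].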
Nov 07, 2014
- Scheduler: Added support for composite operations including unary operations. · b5909a76
  Karl Rupp authored Nov 07, 2014
```
Enables support for operations like
 C = fabs(A - trans(B))
and is supposed to work with both matrix and vector expressions.
```
  b5909a76
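As a reference for what C = fabs(A - trans(B)) computes element by element, here is a plain-loop sketch. It only illustrates the semantics on row-major `std::vector` storage; the function name `fabs_a_minus_trans_b` is hypothetical and the code says nothing about how the scheduler evaluates the expression.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// A is rows x cols, B is cols x rows, both stored row-major.
// Returns C (rows x cols) with C(i,j) = |A(i,j) - B(j,i)|.
inline std::vector<double> fabs_a_minus_trans_b(std::vector<double> const & A,
                                                std::vector<double> const & B,
                                                std::size_t rows,
                                                std::size_t cols)
{
  assert(A.size() == rows * cols);
  assert(B.size() == rows * cols);

  std::vector<double> C(rows * cols);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      C[i * cols + j] = std::fabs(A[i * cols + j] - B[j * rows + i]);
  return C;
}
```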
- Clang 3.5: Fixed warnings at high warning levels. · 8b4567bc
  Karl Rupp authored Nov 07, 2014
```
Flags used: -Wall -Wextra -Weverything -pedantic -Werror
-Wno-exit-time-destructors -Wno-global-constructors -Wno-padded -Wno-weak-vtables
-Wno-documentation -Wno-old-style-cast -Wno-switch-enum
```
  8b4567bc
- Clang 3.0: Fixed warnings related to unused macros in release mode in tests. · 701a727a
  Karl Rupp authored Nov 07, 2014
  
  701a727a
- NMF: Moved header back to viennacl/linalg/nmf.hpp · 102265a5
  Karl Rupp authored Nov 07, 2014
```
For reasons of backwards compatibility and uniformity with other headers.
```
  102265a5
- Clang 3.0: Fixed warnings at high warning levels. · 6918bf3f
  Karl Rupp authored Nov 07, 2014
```
Flags used: -Wall -Wextra -Weverything -pedantic -Werror
-Wno-exit-time-destructors -Wno-global-constructors -Wno-padded -Wno-weak-vtables
```
  6918bf3f
- Changelog: Updated for 1.6.0 release. · 771aa713
  Karl Rupp authored Nov 07, 2014
  
  771aa713
- Doxygen: Added warnings on mixing 'auto' with expression templates. · 01534425
  Karl Rupp authored Nov 07, 2014
```
Resolves #82 to the extent possible. A bullet-proof fix for 'auto'
requires substantial refactoring and internal changes, which
won't happen prior to ViennaCL 2.0.0.
```
  01534425
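The interaction between 'auto' and expression templates can be shown with a miniature, self-contained sketch. The types `vec` and `vec_sum` are hypothetical and far simpler than ViennaCL's own classes; they only demonstrate the general C++ behavior.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical vector type for illustration.
struct vec
{
  std::vector<double> data;
  explicit vec(std::size_t n, double value = 0.0) : data(n, value) {}
};

// Proxy for "lhs + rhs": stores references only and computes nothing
// until it is converted to a vec.
struct vec_sum
{
  vec const & lhs;
  vec const & rhs;

  // Evaluation happens here, at conversion time.
  operator vec() const
  {
    vec result(lhs.data.size());
    for (std::size_t i = 0; i < lhs.data.size(); ++i)
      result.data[i] = lhs.data[i] + rhs.data[i];
    return result;
  }
};

// Builds the proxy instead of computing the sum.
inline vec_sum operator+(vec const & a, vec const & b)
{
  return vec_sum{a, b};
}
```

In this sketch, `vec z = x + y;` evaluates immediately, whereas `auto e = x + y;` deduces `vec_sum` and keeps references to `x` and `y`. Later changes to the operands show up when `e` is finally evaluated, and if an operand is a temporary the stored reference dangles. Spelling out the result type avoids both effects.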
- Doxygen: Added C++11 question on FAQ · 7e40e764
  Karl Rupp authored Nov 07, 2014
  
  7e40e764
- OpenCL: Fixed problems in vector operations. · 5f3e50a7
  Karl Rupp authored Nov 07, 2014
```
Introduced by commit 79c21e63
through insufficiently careful fixing of warnings.
```
  5f3e50a7
Nov 06, 2014
- Device-DB: Added AMD Barts architecture. · 8de7e270
  Karl Rupp authored Nov 06, 2014
```
Derived from a Radeon HD 6850 with a tuning run.
```
  8de7e270
- Device-DB: Added AMD Cedar architecture. · e92d8b47
  Karl Rupp authored Nov 06, 2014
```
Obtained on a Radeon HD 5450. Very low-end GPU, profile aims at compatibility
rather than performance.
```
  e92d8b47
- Device-DB: Added NVIDIA Tesla C2050. · a742346a
  Karl Rupp authored Nov 06, 2014
  
  a742346a
- Clang 3.0: Fixed a bunch of warnings at high warning levels. · 79c21e63
  Karl Rupp authored Nov 06, 2014
```
Flags used: -Wall -Wextra -Weverything -pedantic -Werror
-Wno-exit-time-destructors -Wno-global-constructors -Wno-padded -Wno-weak-vtables
```
  79c21e63
- GCC 4.6: Fixed all warnings with -Wall -pedantic -Wextra -Wconversion -g · f2d0cea0
  Karl Rupp authored Nov 06, 2014
  
  f2d0cea0
- GMRES: Added pipelined implementations for all three backends. · f8b4f301
  Karl Rupp authored Nov 06, 2014
```
Only requires four kernels per iteration, which is much better than
the Householder version. Implementation follows Algorithm 2.1 in
Walker, Zhou: "A Simpler GMRES" (1994)
```
  f8b4f301
- Device-DB: Added NVIDIA GeForce GTX 470. · 244e7d92
  Karl Rupp authored Nov 06, 2014
  
  244e7d92
- Device-DB: Added NVIDIA GTX 750 Ti. · d47b3ad2
  Karl Rupp authored Nov 06, 2014
  
  d47b3ad2
- sliced_ell_matrix: Fixed conversion warning in matrix-vector product. · 75d7013e
  Karl Rupp authored Nov 06, 2014
  
  75d7013e
Nov 05, 2014
- Fix OpenCL kernel cache so that the same binaries are not used across different contexts · cf1d5e7b
  Toby Smithe authored Nov 05, 2014
  
  cf1d5e7b
- CUDA: Added CTOR for matrix to wrap user-provided buffer. · 34b3b3b1
  Karl Rupp authored Nov 05, 2014
```
Also includes a new example showing the use case.
Resolves #69.

Reported-by: Pushkar Ratnalikar via viennacl-devel
```
  34b3b3b1
- CUDA: Fixed missing else-clause when wrapping a CUDA buffer. · a563584d
  Karl Rupp authored Nov 05, 2014
```
Might have gotten lost during refactoring?
```
  a563584d
- CUDA: Fixed bug in as() kernel for scalars. · b8ad7ed9
  Karl Rupp authored Nov 05, 2014
  
  b8ad7ed9
- Device-DB: Using scalar types for vector operations. · ab10fb80
  Karl Rupp authored Nov 05, 2014
```
Vector types lead to compilation issues on NVIDIA GPUs with abs(), since
 x = abs(y)
does not compile due to incompatible vector types.
```
  ab10fb80
- sliced_ell_matrix: Fixed OpenMP race condition and use unsigned type for OpenMP loops. · e197b562
  Karl Rupp authored Nov 05, 2014
  
  e197b562
- GEMM on CPU: Fixed race condition with OpenMP. · b1e87684
  Karl Rupp authored Nov 05, 2014
  
  b1e87684
- Vector: Added min() and max() routines. · 64afe1e7
  Karl Rupp authored Nov 05, 2014
  
  64afe1e7
- CUDA: Added libstdc++ on MacOS Mavericks and above. · 159bb9f2
  Karl Rupp authored Nov 05, 2014
```
Discussion here:
https://github.com/viennacl/viennacl-dev/issues/106
```
  159bb9f2
Nov 04, 2014
- Direct Solve: Moved details to detail-namespace, polished public interface. · 42e5de86
  Karl Rupp authored Nov 04, 2014
```
No more collisions with GMRES. Resolves #61.
```
  42e5de86
- Unified use of vcl_size_t instead of just size_t or std::size_t · 65224e30
  Karl Rupp authored Nov 04, 2014
```
Resolves #49
```
  65224e30
- Dense BLAS bench: Added device information and double precision check with OpenCL. · 12681a1b
  Karl Rupp authored Nov 04, 2014
  
  12681a1b