- Nov 04, 2014
-
-
Karl Rupp authored
Resolves #49
-
Karl Rupp authored
-
Karl Rupp authored
vector_base/matrix_base must either always create shallow copies in their copy constructor or always create deep copies. Shallow copies can silently corrupt data and lead to wrong results, whereas the worst side effect of deep copies is likely to be poor performance, which is less harmful. Shallow copies are only needed within proxy objects, which now have suitable overloads for their copy constructor. Resolves #60
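The difference between the two copy semantics can be sketched without any ViennaCL code. The classes `owning_vector` and `range_proxy` below are hypothetical stand-ins, not the library's vector_base or proxy types; they only illustrate why a deep copy is the safer default and why a proxy still wants a shallow copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a container base class (not ViennaCL's vector_base).
class owning_vector {
public:
  explicit owning_vector(std::size_t n, double value = 0.0) : data_(n, value) {}

  // Deep copy: the new object gets its own storage.
  owning_vector(owning_vector const & other) : data_(other.data_) {}
  owning_vector & operator=(owning_vector const &) = default;

  double & operator[](std::size_t i) { return data_[i]; }
  double operator[](std::size_t i) const { return data_[i]; }
  std::size_t size() const { return data_.size(); }

private:
  std::vector<double> data_;
};

// Hypothetical proxy that refers to a sub-range of an existing vector.
class range_proxy {
public:
  range_proxy(owning_vector & v, std::size_t start, std::size_t len)
    : v_(&v), start_(start), len_(len) {}

  // Shallow copy on purpose: both proxies alias the same underlying storage.
  range_proxy(range_proxy const &) = default;

  double & operator[](std::size_t i) { return (*v_)[start_ + i]; }
  std::size_t size() const { return len_; }

private:
  owning_vector * v_;
  std::size_t start_;
  std::size_t len_;
};

// Exercises both copy semantics; the asserts document the expected behavior.
inline void check_copy_semantics() {
  owning_vector a(4, 1.0);
  owning_vector b(a);      // deep copy
  b[0] = 42.0;             // must not touch a
  assert(a[0] == 1.0);

  range_proxy p(a, 1, 2);
  range_proxy q(p);        // shallow copy
  q[0] = 7.0;              // writes through to a[1]
  assert(a[1] == 7.0);
  assert(p[0] == 7.0);
}
```

With a shallow copy in `owning_vector`, the write to `b[0]` would also change `a[0]`, which is the kind of silent corruption the commit message describes.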
-
Karl Rupp authored
Now avoiding unnecessary temporary buffers.
-
Karl Rupp authored
FFT: Rewrote to fix VS2013 compilation errors
-
Karl Rupp authored
-
- Nov 03, 2014
-
-
Matthew Musto authored
Assuming the complex numbers are in Cartesian form, this should be functionally identical to the prior function.
-
Karl Rupp authored
Use of 'uint' is rejected by some compilers, e.g. Visual Studio.
-
Karl Rupp authored
-
Karl Rupp authored
Never include this without a separate switch for Apple systems.
-
Karl Rupp authored
-
Karl Rupp authored
Resolves #68. Resolves #76.
-
Karl Rupp authored
-
Karl Rupp authored
v1 -= pow(v1, v2); might be poorly conditioned if v1 is close to 1.0. This fix lifts the values of v1 to be at least 1.1, hopefully fixing the repeated issues seen in the nightly tests in the past.
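The conditioning problem comes from cancellation: pow(1.0, v2) equals 1.0 for any v2, so for v1 close to 1.0 the two terms of v1 - pow(v1, v2) nearly cancel and rounding errors dominate the result. A minimal sketch of how test data could be kept away from that region follows. The helper name `make_lifted_values`, the seed parameter, and the choice of a uniform distribution are assumptions for illustration and are not taken from the actual test code.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns n pseudo-random values that are all at least 1.1,
// so that v1 - pow(v1, v2) stays away from the cancellation region around v1 = 1.0.
inline std::vector<double> make_lifted_values(std::size_t n, unsigned int seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> dist(0.0, 1.0);  // non-negative offsets

  std::vector<double> v(n);
  for (std::size_t i = 0; i < n; ++i)
    v[i] = 1.1 + dist(gen);  // lift every entry to 1.1 or above
  return v;
}
```

A fixed seed keeps nightly runs reproducible, which helps when a failure has to be traced back to particular input values.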
-
Philippe Tillet authored
-
- Nov 02, 2014
-
-
Karl Rupp authored
Execution times on CPU are otherwise excessive. Provide runtime flag for switching sizes later.
-
Karl Rupp authored
About a factor of 20 faster than the previous implementation. I estimate that more microtuning can gain another factor of 2. Higher performance gains will most likely require intrinsics.
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
Apparently incomplete refactoring from output of tuner.
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
-
Karl Rupp authored
Older devices such as a GeForce 8600 GT were mapped incorrectly. The new detection fixes this for the devices currently available. Needs to be revisited once the post-Maxwell generation is out.
-
- Oct 31, 2014
-
-
Karl Rupp authored
Some macros were not properly prefixed. Now every preprocessor macro carries the prefix 'VIENNACL_'.
-
Karl Rupp authored
The default local workgroup size is now 128, providing better compatibility with weaker and older hardware. Also, preprocessor defines are properly guarded with a VIENNACL-prefix.
-
Karl Rupp authored
-
Karl Rupp authored
The following is now legitimate: std::vector<T> x(N); viennacl::vector<T> vcl_x; viennacl::copy(x, vcl_x); The copy resizes the ViennaCL vector accordingly (in the default context). Similar support for the iterator interface is not possible without substantial refactoring.
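The resize-on-copy behavior can be illustrated with a stand-in that runs without ViennaCL or an OpenCL device. `device_vector` and `copy_to_device` below are hypothetical names that keep everything in host memory. The rule that only an empty target is resized is an assumption of this sketch; the commit message does not spell out the exact condition.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a device-side vector; storage lives in host memory here.
template <typename T>
class device_vector {
public:
  device_vector() = default;
  explicit device_vector(std::size_t n) : storage_(n) {}

  std::size_t size() const { return storage_.size(); }
  void resize(std::size_t n) { storage_.resize(n); }
  T * data() { return storage_.data(); }
  T const * data() const { return storage_.data(); }

private:
  std::vector<T> storage_;
};

// Copies host data into dev. Assumption of this sketch: an empty target is
// resized to match the source, a non-empty target must already match in size.
template <typename T>
void copy_to_device(std::vector<T> const & host, device_vector<T> & dev) {
  if (dev.size() == 0)
    dev.resize(host.size());
  assert(dev.size() == host.size());

  for (std::size_t i = 0; i < host.size(); ++i)
    dev.data()[i] = host[i];
}
```

The iterator-based overload has no such convenience in this sketch either, since a pair of iterators into the target carries no handle through which the target could be resized.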
-
Karl Rupp authored
Resulted in a conversion warning and in a nightly test failure.
-
- Oct 29, 2014
-
-
Karl Rupp authored
Resolves #71.
-
Karl Rupp authored
This way a user can conveniently free memory. Resolves #98.
-
Karl Rupp authored
If a user has provided her own memory buffers, copy() should update the matrix within the same user-provided buffer rather than create a new one. Also, memory_write() is usually faster than memory_create(). Fixes #77.
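The write-instead-of-create policy can be sketched as follows. `mem_handle` and `update` are hypothetical and operate on plain host memory; they are not ViennaCL's memory handle, memory_write(), or memory_create(). The capacity check used to decide between the two paths is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical memory handle: points either at user-provided memory or at an
// allocation that the handle itself owns.
struct mem_handle {
  double * ptr = nullptr;           // start of the active buffer
  std::size_t count = 0;            // capacity of the active buffer in elements
  std::unique_ptr<double[]> owned;  // non-null only if the handle allocated the buffer
};

// Updates the buffer behind h with the contents of src.
// Existing storage is reused when it is large enough ("write" path);
// a new buffer is allocated only otherwise ("create" path).
inline void update(mem_handle & h, std::vector<double> const & src) {
  if (h.ptr != nullptr && h.count >= src.size()) {
    std::copy(src.begin(), src.end(), h.ptr);  // write into the existing buffer
    return;
  }
  h.owned.reset(new double[src.size()]);       // allocate fresh storage
  h.ptr = h.owned.get();
  h.count = src.size();
  std::copy(src.begin(), src.end(), h.ptr);
}
```

Because the write path never replaces `h.ptr`, a pointer the user handed in stays valid and continues to see the updated data.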
-
- Oct 28, 2014
-
-
Karl Rupp authored
Requires Doxygen 1.8.x or above.
-
- Oct 27, 2014
-
-
Philippe Tillet authored
Could cause linkage issues when matrix_proxy.hpp was included without matrix.hpp.
-
- Oct 24, 2014
-