Extended matrix transpose to the CUDA, OpenCL, and OpenMP backends. Added an in-place matrix transpose function ( A = trans(A) ).
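
For illustration, the in-place case ( A = trans(A) ) can be sketched on the host side as below. This is only a minimal sketch under stated assumptions, not the code from this change: it assumes a square, row-major matrix stored in a flat std::vector<double>, and the name inplace_transpose_square is hypothetical. The OpenMP pragma is optional; without OpenMP enabled the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not the library's API): transposes an n x n
// row-major matrix stored in a flat vector, in place.
void inplace_transpose_square(std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);  // assumption: square matrix, dense storage

    // Every pair (i, j) with i < j is swapped with (j, i) exactly once,
    // and no two outer iterations touch the same element, so the outer
    // loop can be parallelized safely.
    #pragma omp parallel for schedule(dynamic)
    for (long long i = 0; i < static_cast<long long>(n); ++i) {
        const std::size_t r = static_cast<std::size_t>(i);
        for (std::size_t c = r + 1; c < n; ++c) {
            std::swap(a[r * n + c], a[c * n + r]);
        }
    }
}
```

Rectangular matrices are not covered by this sketch: transposing them in place changes the row stride, which typically requires a cycle-following permutation or a temporary buffer.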