Starting the transition towards a smaller number of vector kernels. Started with a generic kernel v1 = alpha * v2 + beta * v3, where alpha and beta are each either a CPU or a GPU scalar and may have their sign flipped or their reciprocal taken inside the kernel.
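
A minimal host-side sketch of what such a generic kernel could look like, for illustration only. The names `ScalarOption`, `apply_options` and `avbv` are hypothetical and not taken from the actual code, and the CPU/GPU scalar distinction is left out: both scalars are plain host values here, whereas a device kernel would additionally need to read a GPU scalar from device memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical option flags (illustrative names, not from the commit).
enum ScalarOption : unsigned {
    kNone       = 0,
    kFlipSign   = 1u << 0,  // use -s instead of s
    kReciprocal = 1u << 1   // use 1/s instead of s
};

// Apply the requested modifications to a scalar before it is used.
inline double apply_options(double s, unsigned options) {
    if (options & kFlipSign)   s = -s;
    if (options & kReciprocal) s = 1.0 / s;
    return s;
}

// Generic kernel sketch: v1 = alpha * v2 + beta * v3,
// with sign flip and reciprocal handled inside the routine.
void avbv(std::vector<double>& v1,
          double alpha, unsigned alpha_options, const std::vector<double>& v2,
          double beta,  unsigned beta_options,  const std::vector<double>& v3) {
    assert(v2.size() == v1.size() && v3.size() == v1.size());
    const double a = apply_options(alpha, alpha_options);
    const double b = apply_options(beta, beta_options);
    for (std::size_t i = 0; i < v1.size(); ++i) {
        v1[i] = a * v2[i] + b * v3[i];
    }
}
```

Under these assumptions, an operation such as v1 = v2 - v3 / gamma would be a call with alpha = 1 and no options, and beta = gamma with `kFlipSign | kReciprocal`, so no separate subtraction or division kernel is needed.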