Start the transition towards a smaller number of vector kernels. Begin with a generic kernel v1 = alpha * v2 + beta * v3, where alpha and beta are each either a CPU or a GPU scalar and may have their sign flipped or their reciprocal taken inside the kernel.
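
A minimal host-side sketch of the idea, for illustration only. The names ScalarOptions, apply_options and avbv are hypothetical and not taken from the actual code, and the CPU-versus-GPU scalar distinction is not modeled here: both scalars are plain host doubles. It only shows how one kernel with per-scalar flags could cover the add, subtract and divide variants that would otherwise need separate kernels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-scalar flags (names are assumptions, not from the commit).
struct ScalarOptions {
    bool flip_sign;   // use -s instead of s
    bool reciprocal;  // use 1/s instead of s
};

// Apply the flags to a scalar before it enters the vector update.
inline double apply_options(double s, ScalarOptions opt) {
    if (opt.reciprocal) s = 1.0 / s;
    if (opt.flip_sign)  s = -s;
    return s;
}

// Generic update v1 = alpha * v2 + beta * v3, with both scalars
// optionally negated and/or inverted inside the routine.
void avbv(std::vector<double>& v1,
          double alpha, ScalarOptions alpha_opt, const std::vector<double>& v2,
          double beta,  ScalarOptions beta_opt,  const std::vector<double>& v3) {
    // Sketch assumption: all three vectors have the same length.
    assert(v2.size() == v1.size() && v3.size() == v1.size());
    const double a = apply_options(alpha, alpha_opt);
    const double b = apply_options(beta, beta_opt);
    for (std::size_t i = 0; i < v1.size(); ++i)
        v1[i] = a * v2[i] + b * v3[i];
}
```

With this shape, v1 = v2 - v3 / beta would be a call with flip_sign and reciprocal both set for beta, rather than a dedicated kernel.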