Incorrect type inference when preprocess_kernel is invoked earlier.
The snippet:
import numpy as np
import pyopencl as cl
import pyopencl.clrandom as cl_random
import loopy as lp

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
n = 16  # assumed: matches the loop domain 0<=i<16 used below
x = cl_random.rand(queue, n, np.float32)
u = cl_random.rand(queue, n, np.float32)
knl = lp.make_kernel(
        "{[i]: 0<=i<16}",
        """
        y[i] = sin(log(x[i])) {id=insn1}
        z[i] = tan(y[i]) + cos(x[i]) + 2*u[i] {id=insn2}
        """)
knl = lp.set_options(knl, "write_cl")
knl = lp.preprocess_kernel(knl)
evt, out = knl(queue, x=x, u=u)
generates the following code:
__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global
float const *__restrict__ u, __global float const *__restrict__ x, __global
float *__restrict__ y, __global int *__restrict__ z)
{
for (int i = 0; i <=15; ++i)
y[i] = sin(log(x[i]));
z[i] = tan(y[i]) + cos(x[i]) + 2 * u[i];
}
The inferred type of z is int, which is not expected. The dtype int was propagated from insn2 during the lp.preprocess_kernel(...) call. The type of z was set using only the one operand whose type was already known at that point, the literal 2 (an int); the other operands of insn2 did not contribute.
infer_unknown_dtypes should leave the dtype of z marked as unknown until the types of all the variables that z depends on have been resolved.
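
The proposed behavior can be illustrated with a small toy model. This is only a sketch and not loopy's actual implementation: the function names (promote, infer_dtypes), the dependency table, and the simplified promotion order are all hypothetical and chosen for illustration. The point is that a variable is assigned a dtype only once every one of its operands has one; otherwise it stays unknown.

```python
# Toy model of deferred dtype inference; NOT loopy's implementation.
# Hypothetical, simplified promotion order: later entries win when mixed.
# (Real NumPy promotion rules are more involved.)
PROMOTION_ORDER = ["int32", "float32", "float64"]


def promote(dtypes):
    """Return the widest dtype in `dtypes` according to PROMOTION_ORDER."""
    return max(dtypes, key=PROMOTION_ORDER.index)


def infer_dtypes(known, deps):
    """Infer dtypes for the variables written in `deps`.

    known: dict mapping variable name -> dtype name for already-typed names.
    deps:  dict mapping each written variable -> list of its operands
           (variable names, or a dtype name for a literal, e.g. "int32" for 2).

    A variable is typed only when *all* of its operands have a dtype.
    Returns (resolved dtypes, sorted list of still-unknown variables).
    """
    result = dict(known)
    pending = dict(deps)
    progress = True
    while pending and progress:
        progress = False
        for var, operands in list(pending.items()):
            operand_dtypes = [
                op if op in PROMOTION_ORDER else result.get(op)
                for op in operands
            ]
            if None in operand_dtypes:
                continue  # a dependency is still unknown: defer this variable
            result[var] = promote(operand_dtypes)
            del pending[var]
            progress = True
    return result, sorted(pending)


# Dependencies of the two instructions in the snippet above.
deps = {"y": ["x"], "z": ["y", "x", "int32", "u"]}

# Before the dtypes of x and u are known, y and z both stay unknown
# instead of z being typed from the literal 2 alone.
early, unresolved_early = infer_dtypes({}, deps)

# Once x and u are known to be float32, both y and z resolve.
late, unresolved_late = infer_dtypes({"x": "float32", "u": "float32"}, deps)
```

In this model, calling the inference early simply reports y and z as unresolved, so a later pass with the argument dtypes available can still assign them a floating-point type.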