
Incorrect type inference when preprocess_kernel is invoked earlier.

The snippet:

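import numpy as np
import pyopencl as cl
import pyopencl.clrandom as cl_random
import loopy as lp

# Setup assumed by the snippet (not part of the original report);
# n is chosen to match the loop domain 0 <= i < 16 below.
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
n = 16

x = cl_random.rand(queue, n, np.float32)
u = cl_random.rand(queue, n, np.float32)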

knl = lp.make_kernel(
        "{[i]: 0<=i<16}",
        """
        y[i] = sin(log(x[i])){id=insn1}
        z[i] = tan(y[i]) + cos(x[i]) + 2*u[i]{id=insn2}
        """)

knl = lp.set_options(knl, "write_cl")
knl = lp.preprocess_kernel(knl)

evt, out = knl(queue, x=x, u=u)

generates the following code:

__kernel void __attribute__ ((reqd_work_group_size(1, 1, 1))) loopy_kernel(__global
        float const *__restrict__ u, __global float const *__restrict__ x, __global
        float *__restrict__ y, __global int *__restrict__ z)
{
    for (int i = 0; i <= 15; ++i)
    {
        y[i] = sin(log(x[i]));
        z[i] = tan(y[i]) + cos(x[i]) + 2 * u[i];
    }
}

The inferred type of z is int, which is not expected: x and u are passed as float32 arrays. The int dtype was propagated from insn2 during the lp.preprocess_kernel(...) call. When that call runs, the only operand of insn2 with a known type is the literal 2 (an int), so the type of z is set from that operand alone.

infer_unknown_dtypes should keep the dtype of z marked as unknown until the types of all variables that z depends on have been resolved.
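To make the proposed behavior concrete, here is a toy sketch in plain Python. It is not loopy's implementation: the names promote, infer_eager, infer_deferred, OPERANDS and lit_2 are all hypothetical, dtypes are plain strings with a made-up promotion order, and the return types of sin/log/tan/cos are ignored. It only contrasts inferring from whatever operands happen to be known with waiting until every operand is resolved.

```python
# Toy promotion order (an assumption for illustration): higher rank wins.
_RANK = {"int32": 0, "float32": 1, "float64": 2}

# Operands read by each assignment in the kernel above; "lit_2" stands
# for the integer literal 2 in insn2.
OPERANDS = {
    "y": ["x"],                      # insn1
    "z": ["y", "x", "u", "lit_2"],   # insn2
}


def promote(dtypes):
    """Return the highest-ranked dtype among the given ones."""
    return max(dtypes, key=_RANK.__getitem__)


def infer_eager(known, operands_of):
    """Mimics the reported problem: use only operands known right now."""
    resolved = dict(known)
    for target, operands in operands_of.items():
        available = [resolved[op] for op in operands if op in resolved]
        if target not in resolved and available:
            resolved[target] = promote(available)
    return resolved


def infer_deferred(known, operands_of):
    """Assign a dtype only once every operand of the target is resolved."""
    resolved = dict(known)
    progress = True
    while progress:
        progress = False
        for target, operands in operands_of.items():
            if target in resolved:
                continue
            if all(op in resolved for op in operands):
                resolved[target] = promote(resolved[op] for op in operands)
                progress = True
    return resolved  # targets with unresolved operands stay absent (unknown)
```

With only the literal known, infer_eager({"lit_2": "int32"}, OPERANDS) assigns int32 to z, while infer_deferred leaves both y and z unresolved. Once x and u are supplied as float32, infer_deferred resolves y and then z to float32.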