Sequential loop bounds generation does not obey projection semantics
import loopy as lp

knl = lp.make_kernel(
    "{[i, loc1, loc2]: 0 <= loc1 <= 1 and 0 <= loc2 <= 2"
    " and 0 <= i <= loc1 and 0 <= i <= loc2}",
    """
    <>tmp[loc2] = 0
    for i
        tmp[i] = 1 {inames=i:loc2}
    end
    out[loc2] = tmp[loc2]
    """,
    "...",
    seq_dependencies=True)
# Both parallel inames are mapped to the same local axis.
knl = lp.tag_inames(knl, dict(loc1="l.0", loc2="l.0"))
knl = lp.set_temporary_scope(knl, "tmp", "local")
This kernel returns [1, 1, 0], but I would expect it to return [1, 1, 1]. The root cause appears to be that the loop bounds for i are too tight: they depend on both loc1 and loc2. I think they should depend only on loc2, to be consistent with loopy's projection semantics.
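To make the difference between the two kinds of bounds concrete, here is a small brute-force sketch in plain Python (it does not use loopy or islpy). It enumerates the integer points of the domain above and compares the values of i obtained by projecting loc1 away with the values obtained when loc1 and loc2 are both pinned to the same work-item index. The pinning is an assumption about how tagging both inames l.0 affects the generated bounds, and the helper names are hypothetical.

```python
from itertools import product

# Integer points of {[i, loc1, loc2]: 0 <= loc1 <= 1 and 0 <= loc2 <= 2
#                    and 0 <= i <= loc1 and 0 <= i <= loc2}.
domain = [
    (i, loc1, loc2)
    for i, loc1, loc2 in product(range(3), range(2), range(3))
    if i <= loc1 and i <= loc2
]


def i_values_projected(loc2):
    """Values of i once loc1 is projected out (any valid loc1 will do)."""
    return sorted({i for (i, _l1, l2) in domain if l2 == loc2})


def i_values_tight(lid):
    """Values of i when loc1 and loc2 are both pinned to the same index.

    Assumption: this mimics bounds that depend on both inames.
    """
    return sorted({i for (i, l1, l2) in domain if l1 == lid and l2 == lid})


for lid in range(3):
    print(lid, i_values_projected(lid), i_values_tight(lid))
```

For index 2 the pinned variant yields no iterations at all, because loc1 = 2 lies outside the domain, while the projected variant still allows i = 0 and i = 1.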
(Part of) the fix is that get_usable_inames_for_conditional() should return only the set of parallel inames common to the block. Right now it returns both loc1 and loc2.
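One way to read "the common set of parallel inames in the block" is as the intersection, over the instructions in the block, of each instruction's parallel inames. The following is a minimal sketch of that idea in plain Python. It is not loopy's implementation, and the function name and the way instructions are represented here are hypothetical.

```python
def common_parallel_inames(insn_inames, parallel_inames):
    """Return the parallel inames shared by every instruction in a block.

    insn_inames: one collection of iname names per instruction in the block.
    parallel_inames: all inames that carry a parallel tag.
    (Both arguments are illustrative stand-ins, not loopy data structures.)
    """
    parallel_inames = frozenset(parallel_inames)
    per_insn = [frozenset(inames) & parallel_inames for inames in insn_inames]
    if not per_insn:
        return frozenset()
    return frozenset.intersection(*per_insn)


# The body of the "for i" loop holds one instruction, nested in i and loc2.
block = [{"i", "loc2"}]
usable = common_parallel_inames(block, {"loc1", "loc2"})
print(sorted(usable))
```

For the loop body in the reproducer this yields only loc2, so a conditional built from it would no longer involve loc1.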
Edited by Matt Wala