WIP: Nest and predicate statements

Some instructions are not nested within inames that are, e.g., tagged as local hardware axes. To resolve the issues this causes, this MR provides a transformation that nests such instructions within that loop and adds a predicate so that they are evaluated only once.

Consider the following variant of the convolution kernel in test_convolution in test_apps.py:

knl = lp.split_iname(knl, "im_x", 2, outer_tag="g.0", inner_tag="l.0")
knl = lp.split_iname(knl, "im_y", 2, outer_tag="g.1", inner_tag="l.1")
knl = lp.tag_inames(knl, dict(ifeat="l.2"))
knl = lp.add_prefetch(knl, "img", "im_x_inner, im_y_inner, f_x, f_y",
                        default_tag="l.auto")
knl = nest_and_predicate_instructions(knl, "ifeat", "id:img_fetch_rule")

The kernel will schedule if silenced_warnings=['write_race_local(img_fetch_rule)','iname-order'] is set, but the code produced is incorrect (even though it appears to nest and predicate the fetch instruction correctly). Without nest_and_predicate_instructions the kernel schedules and is correct, but the same prefetch is performed redundantly for every value of ifeat/l.2.
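To make the intended effect concrete, here is a minimal plain-Python model of the redundancy described above. It does not use loopy and is not the generated code: the function name count_fetch_stores, the 2x2 tile size, and the choice of 3 values for ifeat are all assumptions made for illustration. Each loop stands in for one local hardware axis, and the guard on l2 stands in for the predicate that the transformation is meant to add.

```python
def count_fetch_stores(n_x, n_y, n_feat, predicated):
    """Model one work-group filling a local tile (hypothetical helper).

    l0, l1, l2 stand in for the l.0, l.1 and l.2 hardware axes. The fetch
    only depends on (l0, l1), so every l2 value writes the same entries
    unless the store is guarded.
    """
    stores = 0
    tile = {}
    for l2 in range(n_feat):          # models ifeat tagged as l.2
        for l1 in range(n_y):         # models im_y_inner tagged as l.1
            for l0 in range(n_x):     # models im_x_inner tagged as l.0
                if predicated and l2 != 0:
                    continue          # predicate: only ifeat == 0 fetches
                tile[(l0, l1)] = ("img", l0, l1)
                stores += 1
    return stores, tile


# Assumed sizes: a 2x2 tile and 3 feature work-items.
redundant, tile_a = count_fetch_stores(2, 2, 3, predicated=False)
guarded, tile_b = count_fetch_stores(2, 2, 3, predicated=True)
print(redundant, guarded, tile_a == tile_b)
```

In this model the guarded version writes each tile entry once instead of once per ifeat value, and the resulting tile contents are identical. The model says nothing about why the real generated code is incorrect; it only shows what the predicate is supposed to save.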