l.0-parallel reduction codegen mishandles simul_reduce

As reported by @kaushikcfd on Riot:

import loopy as lp

knl = lp.make_kernel(
        "{[i]: 0<=i<4}",
        """
        a = simul_reduce(sum, i, 7*i)
        b = simul_reduce(sum, i, 10*i)
        """)

knl = lp.tag_inames(knl, "i:l.0")
knl = lp.realize_reduction(knl)

print(lp.generate_code_v2(knl).device_code())

The resulting device code contains two sets of inames.
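As a reference point when checking a fix, the values the two reductions should produce can be computed in plain Python. This is only a sketch of the intended arithmetic, not loopy's generated code: the helper name `reference_simul_reduce` is hypothetical, and the single shared loop over `i` reflects the assumption that both reductions are meant to run over the same iname.

```python
def reference_simul_reduce(n=4):
    """Plain-Python reference for the kernel above (not loopy output)."""
    a = 0
    b = 0
    # Assumption: one loop over i serves both reductions.
    for i in range(n):
        a += 7 * i
        b += 10 * i
    return a, b


a, b = reference_simul_reduce()
print(a, b)  # 42 60
```

Comparing the kernel's results for `a` and `b` against these values gives a quick correctness check that does not depend on how the inames are laid out in the generated code.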

Edited by Kaushik Kulkarni