This changes the count granularity for local work (math operations and local memory accesses) from workitem to subgroup.
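As a rough illustration of what the granularity change means for totals, here is a minimal sketch. It assumes an operation executed by every workitem is recorded once per workitem under the old granularity and once per subgroup under the new one. The function names and parameters (`n_workgroups`, `workgroup_size`, `subgroup_size`) are hypothetical and are not taken from this change.

```python
def subgroups_per_workgroup(workgroup_size, subgroup_size):
    # Integer ceiling division: a partially filled subgroup still counts as one.
    return -(-workgroup_size // subgroup_size)


def total_at_workitem_granularity(ops_per_workitem, n_workgroups, workgroup_size):
    # Old behaviour (assumed): each workitem contributes its own count.
    return ops_per_workitem * n_workgroups * workgroup_size


def total_at_subgroup_granularity(ops_per_workitem, n_workgroups,
                                  workgroup_size, subgroup_size):
    # New behaviour (assumed): each subgroup contributes one count per op.
    n_subgroups = subgroups_per_workgroup(workgroup_size, subgroup_size)
    return ops_per_workitem * n_workgroups * n_subgroups


if __name__ == "__main__":
    # Hypothetical launch: 4 workgroups of 64 workitems, subgroups of 32.
    print(total_at_workitem_granularity(2, 4, 64))
    print(total_at_subgroup_granularity(2, 4, 64, 32))
```

Under these assumptions, totals reported at subgroup granularity are smaller than workitem totals by roughly a factor of the subgroup size, so numbers gathered before and after this change are not directly comparable without rescaling.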