Adding all local strides to mem access

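This changes how we track strides in MemAccess: instead of a single stride, we record the stride of every local id found in the index. This gives a more precise description of the access pattern and reduces guessing.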

Old version: Previously, for global accesses, we set MemAccess.stride to the stride of lid0 if lid0 was present in the index. If no local ids were present in the index, we set stride = 0. If lid0 was not present in the index but other local ids were, we set stride = sys.maxsize. (I think this should have been stride 0 as well; I'm not sure why I did it this way before.)
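To make the old rule concrete, here is a minimal sketch of the three cases described above. The function name `old_stride` and its input (a dict mapping each local id axis found in the index to its stride) are hypothetical and used only for illustration; this is not the actual implementation.

```python
import sys


def old_stride(strides_by_lid):
    """Collapse per-local-id strides into one value, following the old rule.

    strides_by_lid is a hypothetical input: {local id axis: stride} for
    every local id that appears in the index expression.
    """
    if not strides_by_lid:
        # No local ids in the index.
        return 0
    if 0 in strides_by_lid:
        # lid0 is present: report its stride and ignore the other local ids.
        return strides_by_lid[0]
    # lid0 is absent but other local ids are present.
    return sys.maxsize
```

Under this rule, `{0: 1, 1: 16}` and `{0: 1}` both collapse to stride 1, so the strides of the other local ids are lost.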

New version: Now we set MemAccess.lid_strides to a dict mapping each local id found in the index to its stride. This provides more information about the memory access pattern and is necessary for later computing a buswidths-per-sub-group estimate. It also gives a more precise description of accesses that may be considered uniform. For example, lid_strides={} if no local ids were found, lid_strides={1:X, 2:Y, ...} if lid0 was not found but other local ids were, and lid_strides={0:0, ...} if lid0 was found and its stride is 0. Under the old scheme these cases were reduced to a single number (0 or sys.maxsize); keeping all of the local strides tells us more than simply setting stride=0.
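As an illustration, the sketch below shows how a consumer might tell these cases apart from a `lid_strides` dict alone. The helper `describe_access` and the example stride values are assumptions made for this example; only the dict shape (local id axis mapped to stride) comes from the description above.

```python
def describe_access(lid_strides):
    """Classify an access from its {local id axis: stride} dict.

    Hypothetical helper for illustration; not part of MemAccess.
    """
    if not lid_strides:
        return "no local ids in index"
    if 0 not in lid_strides:
        return "lid0 absent, other local ids present"
    if lid_strides[0] == 0:
        return "lid0 present with stride 0"
    return "lid0 present with nonzero stride"


# Example values are made up; X and Y in the text stand for arbitrary strides.
examples = [{}, {1: 4, 2: 64}, {0: 0, 1: 8}, {0: 1, 1: 16}]
for lid_strides in examples:
    print(lid_strides, "->", describe_access(lid_strides))
```

The first three cases are the ones that may be considered uniform with respect to lid0, and they stay distinguishable here because the strides of the remaining local ids are still available.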
