Global memory barriers
Lets the user specify, through the instruction options, that a local barrier should synchronize global memory (i.e. via CLK_GLOBAL_MEM_FENCE in OpenCL). Note that the memory type is not determined automatically, though that would likely be a valuable addition in the future. If no memory type is specified, local memory is used, which matches the previous default.
Edit: I think we actually get the memory type for free, at least for barriers inserted by schedule.insert_barriers, since we can simply pull the memory type from DependencyRecord.var_kind.
AFAIK, no other target has an equivalent to this selective memory synchronization, but I will gladly implement the minor changes that would be needed in that target's emit_barrier code if anyone points them out.
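
For reviewers, here is a rough standalone sketch of the selection logic on the OpenCL side. It is not the code in this PR: the helper name opencl_barrier_call, the mem_kind parameter, and the lookup table are made up for illustration. The only things taken as given are the OpenCL fence flags CLK_LOCAL_MEM_FENCE and CLK_GLOBAL_MEM_FENCE and the local-memory default described above.

```python
# Illustrative only: these names are hypothetical and do not mirror
# the actual emit_barrier implementation.
_FENCE_FLAGS = {
    "local": "CLK_LOCAL_MEM_FENCE",
    "global": "CLK_GLOBAL_MEM_FENCE",
}


def opencl_barrier_call(mem_kind="local"):
    """Return the OpenCL C statement for a work-group barrier.

    mem_kind selects which memory the barrier fences. It defaults to
    "local", matching the behavior before this change.
    """
    try:
        flag = _FENCE_FLAGS[mem_kind]
    except KeyError:
        raise ValueError(f"unknown memory kind: {mem_kind!r}") from None
    return f"barrier({flag});"


print(opencl_barrier_call())          # default: local memory fence
print(opencl_barrier_call("global"))  # explicit: global memory fence
```

A target without a selective fence could presumably ignore the argument or reject "global" outright; which of the two is right depends on the target.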