Fix loopy.statistics for kernel callables

This is a large refactoring with many pieces:
- Counts from subkernels are incorporated using
  subst_into_{pwqpolynomial,guarded_pwqpolynomial,to_count_map}.
  This replaces a prior, broken scheme that existed on the kernel
  callables branch.
- Separate ToCountMap and ToCountPolynomialMap, i.e. separate
  to-count map types by their value type. The latter type
  now knows (and checks) its isl space.
- The numpy_types argument is now deprecated and ignored; it
  did not seem to do anything previously.
- Introduce Sync() count key for synchronization counting.
- Code/robustness cleanups in the ToCountMap* types.
- All op descriptors now carry a kernel_name.
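
As a rough illustration of the first and last items above, the sketch below folds counts from a subkernel into its caller's counts, with every key tagged by a kernel_name. It is a simplified stand-in, not loopy's API: the Op class and merge_subkernel_counts are hypothetical names, and plain integers stand in for the piecewise quasi-polynomials that the real to-count maps hold.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical op descriptor (not loopy's class); carries a kernel_name."""
    dtype: str
    name: str
    kernel_name: str


def merge_subkernel_counts(caller_counts, callee_counts, n_calls):
    """Return the caller's counts plus the callee's counts scaled by n_calls.

    Assumption: counts are plain integers here, whereas loopy substitutes
    into piecewise quasi-polynomials.
    """
    merged = Counter(caller_counts)
    for key, count in callee_counts.items():
        merged[key] += n_calls * count
    return merged


caller = {Op("float64", "add", "outer"): 10}
callee = {Op("float64", "mul", "inner"): 3}

# The caller invokes the subkernel four times.
total = merge_subkernel_counts(caller, callee, n_calls=4)
```

Because kernel_name is part of the key, counts from the caller and from the subkernel stay distinguishable after merging.
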
There are still a few FIXMEs, mainly the SUBGROUP granularity and the
footprint gatherer.