Fix loopy.statistics for kernel callables

This is a large refactoring with many pieces:

- Counts from subkernels are incorporated using
  subst_into_{pwqpolynomial,guarded_pwqpolynomial,to_count_map}.
  This replaces a prior, broken scheme that existed on the kernel
  callables branch.
- Separate ToCountMap and ToCountPolynomialMap, i.e. separate to-count
  map types by their value type. The latter type now knows (and checks)
  its isl space.
- The numpy_types argument is now deprecated and ignored; it did not
  seem to do anything previously.
- Introduce Sync() count key for synchronization counting.
- Code/robustness cleanups in the ToCountMap* types.
- All op descriptors now carry a kernel_name.

There are still a few FIXMEs, mainly the SUBGROUP granularity and the
footprint gatherer.
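
To illustrate the idea of a count map that knows and checks its space, here is a minimal standalone sketch. It does not use loopy or islpy: `ToyPoly`, `PolyCountMap`, and the fields of this `Sync` class are hypothetical stand-ins invented for this example, and a tuple of parameter names plays the role of an isl space.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sync:
    """Hypothetical count key for a synchronization event."""
    kind: str
    kernel_name: str  # every key records which kernel it came from


@dataclass(frozen=True)
class ToyPoly:
    """Toy linear 'polynomial': one coefficient per parameter in `space`."""
    space: tuple   # parameter names, a stand-in for an isl space
    coeffs: tuple  # coefficients, aligned with `space`

    def __add__(self, other):
        if self.space != other.space:
            raise ValueError("space mismatch between polynomials")
        summed = tuple(a + b for a, b in zip(self.coeffs, other.coeffs))
        return ToyPoly(self.space, summed)


class PolyCountMap:
    """Toy map from count keys to ToyPoly values sharing one space."""

    def __init__(self, space, counts=None):
        self.space = space
        self.counts = dict(counts or {})
        # The map checks that every value lives in the map's own space.
        for key, val in self.counts.items():
            if val.space != space:
                raise ValueError(f"value for {key!r} has the wrong space")

    def __add__(self, other):
        if self.space != other.space:
            raise ValueError("space mismatch between count maps")
        merged = dict(self.counts)
        for key, val in other.counts.items():
            merged[key] = merged[key] + val if key in merged else val
        return PolyCountMap(self.space, merged)


key = Sync("barrier_local", "knl")
a = PolyCountMap(("n",), {key: ToyPoly(("n",), (2,))})
b = PolyCountMap(("n",), {key: ToyPoly(("n",), (3,))})
total = a + b
```

In this sketch, merging two maps over the same space adds counts key by key, while constructing or merging with a mismatched space raises a `ValueError` instead of silently producing a meaningless count.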