Stride checking of numpy arrays is too strict

Numpy ignores dimensions of size 1 when determining whether an array is C- or Fortran-contiguous. At run time, loopy checks the strides of every dimension of an array, including those of size 1. As a result, some arrays that numpy considers contiguous fail loopy's stride check, even though they are valid inputs. As an example:

>>> matvec = lp.make_kernel("{[i,j]: 0 <= i <= n and 0 <= j <= m}",
... """
... a[i] = sum(j, A[i,j] * b[j])
... """)
>>> A = np.asarray(np.zeros((1, 10), order="F"), order="C")
>>> b = np.zeros(10)
>>> matvec(queue, A=A, b=b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/matt/src/loopy/loopy/kernel/__init__.py", line 1260, in __call__
    return kex(*args, **kwargs)
  File "/Users/matt/src/loopy/loopy/target/pyopencl_execution.py", line 351, in __call__
    out_host, **kwargs)
  File "<generated code>", line 120, in invoke_loopy_kernel_loopy_kernel
TypeError: strides mismatch on argument 'A' (got: (8, 8), expected: (80, 8))
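
The mismatch can be reproduced with numpy alone. The sketch below is only an illustration: `canonical_c_strides` is a hypothetical helper (not part of numpy or loopy) that computes row-major strides without special-casing size-1 axes, which is what the "expected" value in the error message appears to correspond to. The last step shows one possible workaround, assuming the array is non-empty and already C-contiguous in numpy's sense: re-viewing the same buffer with those strides via `numpy.lib.stride_tricks.as_strided`.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def canonical_c_strides(shape, itemsize):
    """Row-major strides with no special-casing of size-1 axes.

    Hypothetical helper for illustration; not a numpy or loopy function.
    """
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


# Same construction as in the report. No copy is made here, because numpy
# already regards the Fortran-ordered (1, 10) array as C-contiguous.
A = np.asarray(np.zeros((1, 10), order="F"), order="C")

print(A.flags.c_contiguous)  # numpy's own contiguity flag
print(A.strides)             # strides the array actually carries
print(canonical_c_strides(A.shape, A.itemsize))  # strides a strict check expects

# Possible workaround (assumption: A is non-empty and C-contiguous, so only
# the strides of size-1 axes change): view the same memory with strict strides.
A_fixed = as_strided(A, shape=A.shape,
                     strides=canonical_c_strides(A.shape, A.itemsize))
```

Because only strides on size-1 axes differ, `A_fixed` addresses exactly the same elements as `A` without copying. Whether loopy should accept the original strides directly is the question this issue raises.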

Edited Jan 24, 2018 by Matt Wala