Random123 alignment violation leads to crash

Here is the code that copies a vector into an array of scalars:


              // output bulk
              unsigned long idx = get_global_id(0)*4;
              while (idx + 4 < out_size)
              {
                  *(global output_vec_t *) (output + idx) =
                      GET_RANDOM_NUM(RANLUX_FUNC(&ranluxclstate));
                  idx += 4*NUM_WORKITEMS;
              }

`output` is only guaranteed to be aligned to a scalar boundary, but this code performs a vector assignment, which may have stricter alignment requirements. The following script crashes on my system (POCL 1.0 + AVX). For some reason, it only fails on Darwin and not on Linux.

import pyopencl as cl
import pyopencl.clrandom as clrandom
import numpy as np

if __name__ == "__main__":
    c = cl.create_some_context(0)
    rng = clrandom.PhiloxGenerator(c)
    q = cl.CommandQueue(c)
    rng.normal(q, 10, np.float64)

The stack trace indicates an alignment violation (the data should be aligned to 32 bytes):

(inteq) $ lldb python test.py 
(lldb) target create "python"
Current executable set to 'python' (x86_64).
(lldb) settings set -- target.run-args  "test.py"
(lldb) run
Process 93926 launched: '/Users/matt/miniconda3/envs/inteq/bin/python' (x86_64)
Process 93926 stopped
* thread #3, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
    frame #0: 0x000000018e7f69e0 rng_gen_philox4x32_normal.so`_pocl_launcher_rng_gen_philox4x32_normal + 560
rng_gen_philox4x32_normal.so`_pocl_launcher_rng_gen_philox4x32_normal:
->  0x18e7f69e0 <+560>: vmovapd %ymm0, (%r12)
    0x18e7f69e6 <+566>: addq   0xc8(%rbx), %r15
    0x18e7f69ed <+573>: addq   0xd0(%rbx), %r12
    0x18e7f69f4 <+580>: leaq   0x4(%r15), %rax
Target 0: (python) stopped.
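
To make the alignment requirement concrete, here is a rough sketch of the arithmetic in plain Python. It assumes that `output_vec_t` is a 4-wide double vector (which the 256-bit `ymm0` store in the trace suggests) and that an aligned vector store needs an address that is a multiple of the full vector size. The helper name `is_vector_store_aligned` and the base addresses are made up for illustration and are not part of pyopencl or POCL.

```python
def is_vector_store_aligned(base_addr, idx, scalar_size=8, vec_width=4):
    """Check whether a vector store at element index idx is vector-aligned.

    base_addr   -- byte address of the scalar output buffer (hypothetical)
    idx         -- element index, as in `output + idx` in the kernel
    scalar_size -- bytes per scalar (8 for double)
    vec_width   -- number of lanes (4 assumed here, so 32 bytes total)
    """
    vec_align = scalar_size * vec_width      # 32 bytes for a 4-wide double vector
    addr = base_addr + idx * scalar_size     # byte address the store writes to
    return addr % vec_align == 0


# idx is always a multiple of 4 in the kernel, so idx * 8 is a multiple of 32
# and the result depends only on how the base pointer happens to be aligned.
aligned_base = 0x1000   # multiple of 32
scalar_base = 0x1008    # multiple of 8 only, which is all a scalar buffer guarantees

print(is_vector_store_aligned(aligned_base, 4))
print(is_vector_store_aligned(scalar_base, 4))
```

Under these assumptions, the store is safe only when the buffer itself starts on a 32-byte boundary. That may be why the crash depends on the platform: it comes down to what alignment the allocator happens to return.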

Another spot in the same file uses the same vector-copy pattern and may also be susceptible to alignment violations.
