Commit 45f06086 authored by Vladimir Rutsky

add option to compile nvcc's fatbin

A fatbin embeds prebuilt cubins for a set of specified real architectures
and PTX assemblies for a set of specified virtual architectures. This lets
the driver load a prebuilt cubin if one exists for the current real GPU, or
otherwise JIT-compile the PTX of the closest virtual GPU architecture.

This can be used to distribute CUDA-powered applications without having
to install the NVIDIA CUDA Toolkit (nvcc) and a development environment
on the target platform.

.cu files can be precompiled to fatbins on a build server with

    fatbin = pycuda.compiler.compile(
        cu_file_text,
        options=[
            "-gencode", "arch=compute_20,code=compute_20",
            "-gencode", "arch=compute_20,code=sm_20",
            "-gencode", "arch=compute_30,code=compute_30",
            "-gencode", "arch=compute_30,code=sm_30",
        ],
        target="fatbin")

The fatbins can then be distributed to machines without nvcc and loaded using

    cuda_module = pycuda.driver.module_from_buffer(fatbin)
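
The -gencode options shown above follow a regular pattern: each compute
capability contributes one entry embedding PTX (code=compute_XX) and one
embedding a cubin (code=sm_XX). A minimal sketch of a helper that generates
that list is given below. The name gencode_options is hypothetical and is
not part of PyCUDA; it only builds the list of strings.

```python
def gencode_options(archs):
    """Build nvcc -gencode flags for a fatbin.

    `archs` is an iterable of compute-capability strings such as "20" or "30".
    Hypothetical helper for illustration; not part of PyCUDA.
    """
    options = []
    for arch in archs:
        # PTX for the virtual architecture, used by the driver as a JIT fallback.
        options += ["-gencode", "arch=compute_{0},code=compute_{0}".format(arch)]
        # Prebuilt cubin for the matching real architecture.
        options += ["-gencode", "arch=compute_{0},code=sm_{0}".format(arch)]
    return options
```

Under that assumption, the result of gencode_options(["20", "30"]) could be
passed as the options argument of pycuda.compiler.compile(...,
target="fatbin") in place of the hand-written list.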
parent 52fe3956