This project is a set of libraries designed to work with ldc to enable native execution of D on GPUs (and other more exotic OpenCL targets such as FPGAs and DSPs, hereafter just 'GPUs') on the OpenCL and CUDA runtimes.
There are four main parts:
- std: A library containing standard functionality for targeting GPUs and abstractions over the intrinsics of OpenCL and CUDA.
- driver: Handles all the compute API interactions and provides a friendly, easy-to-use, consistent interface. You can always drop down to a lower level of interaction if you need to.
- kernels: A set of standard kernels and primitives to cover a large number of use cases and serve as documentation on how (and how not) to use this library.
- tests: A framework for testing kernels. The suite is runnable with dub test (see dub.json for the configuration used).
Kernel:
@kernel void saxpy(GlobalPointer!(float) res,
                   float alpha,
                   GlobalPointer!(float) x,
                   GlobalPointer!(float) y,
                   size_t N)
{
    auto i = GlobalIndex.x;
    if (i >= N) return;
    res[i] = alpha*x[i] + y[i];
}
Invoke with (CUDA):
q.enqueue!(saxpy)
    ([N,1,1],[1,1,1])         // grid & block & optional shared memory
    (b_res,alpha,b_x,b_y, N); // kernel arguments
equivalent to the CUDA code
saxpy<<<N,1,0,q>>>(b_res,alpha,b_x,b_y, N);
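To make clear what the kernel computes, here is a minimal host-side sketch of the same arithmetic in plain D. It does not use DCompute at all; the name saxpyReference is hypothetical and chosen only for illustration. A loop like this could serve as a CPU reference when checking the output buffer of the GPU kernel.

```d
import std.stdio : writeln;

/// Hypothetical host-side reference (not part of DCompute):
/// computes res[i] = alpha * x[i] + y[i] for every element.
void saxpyReference(float[] res, float alpha,
                    const(float)[] x, const(float)[] y)
{
    assert(res.length == x.length && x.length == y.length,
           "all arrays must have the same length");
    // Each loop iteration corresponds to one GPU thread's work in the kernel.
    foreach (i; 0 .. res.length)
        res[i] = alpha * x[i] + y[i];
}

void main()
{
    float[] x = [1.0f, 2.0f, 3.0f];
    float[] y = [10.0f, 20.0f, 30.0f];
    auto res = new float[](x.length);

    saxpyReference(res, 2.0f, x, y);
    writeln(res); // [12, 24, 36]
}
```

On the GPU, the loop index is replaced by GlobalIndex.x, and the bounds check against N guards threads that fall past the end of the buffers.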
For more examples and the full code, see source/dcompute/tests.
To build DCompute you will need:
- ldc as the D compiler.
- a SPIRV-capable LLVM (available here) to build ldc with SPIRV support (required for OpenCL).
- or LDC built with any LLVM 3.9.1 or greater that has the NVPTX backend enabled, to support CUDA.
- dub
and then just run
$ dub build
or add
"dcompute": "~>0.0.1"
to your dub.json, or
dependency "dcompute" version="~>0.0.1"
to your dub.sdl.
If you get an error like Error: unrecognized switch '-mdcompute-targets=cuda-210', make sure you are using LDC and not DMD: passing --compiler=/path/to/ldc2 to dub will force it to use /path/to/ldc2 as the D compiler.
If you need to build ldc yourself, you will also need cmake and a dmd-compatible D compiler: dmd, ldmd or gdmd (the latter two are available as part of ldc and gdc respectively).
Please see the wiki.
Generate OpenCL builtins from here