Support for GPU Arrays

You can use GPU arrays as input and output arguments to an entry-point function when generating CUDA® MEX, source code, static libraries, dynamic libraries, and executables. The code generator inserts cudaMemcpy calls in the generated code based on two things: whether each entry-point input is identified as a CPU or a GPU input, and whether the variable is used on the CPU or on the GPU. By using the GPU array functionality, you can minimize the number of cudaMemcpy calls in the generated code.

To use this functionality, do one of the following:

  • Use coder.typeof to represent the gpuArray type of an entry-point function input. For example:

    coder.typeof(rand(20),'Gpu',true);
    

  • Use the gpuArray function. For example:

    in = gpuArray(rand(1,10)); 
    codegen -config cfg -args {in} test
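
The two options above can be combined into one end-to-end sketch. It assumes a hypothetical entry-point function test.m that takes a single 1-by-10 double input, and it assumes a MEX target. The cfg variable is created here with coder.gpuConfig, and the memory allocation mode is set to 'discrete' as required for GPU arrays.

```matlab
% Minimal sketch, assuming a hypothetical entry-point function test.m
% that accepts one 1-by-10 double array.
cfg = coder.gpuConfig('mex');            % GPU code configuration object
cfg.GpuConfig.MallocMode = 'discrete';   % mode required for GPU array inputs

% Option 1: describe the input as a GPU array type without creating data.
inType = coder.typeof(rand(1,10), 'Gpu', true);
codegen -config cfg -args {inType} test

% Option 2: pass an actual gpuArray value as the example input.
in = gpuArray(rand(1,10));
codegen -config cfg -args {in} test
```

Either option marks the input as a GPU input. Option 1 may be more convenient when no GPU data is available at code generation time.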
    

Considerations

  • GPU Coder™ supports all numeric and logical types. The char and half data types are not supported. For variable-dimension arrays, only bounded types are supported. Scalar GPU arrays, structures, cell arrays, classes, enumerated types, and fixed-point data types are not supported.

  • The code generator supports all target types for GPU arrays: 'mex', 'lib', 'dll', and 'exe'. For 'lib', 'dll', and 'exe' targets, you must pass the correct pointers to the entry-point function in the example main function. For example, if an input is marked as 'Gpu', pass a GPU pointer when you call the entry-point function from the main function. Software-in-the-loop (SIL) is supported for 'lib' and 'dll'.

  • The memory allocation (malloc) mode property of the code configuration object must be set to 'discrete'. For example:

    cfg.GpuConfig.MallocMode = 'discrete';
    
    GPU arrays are not supported in the 'unified' memory mode.

  • During code generation, if one input to the entry-point function is a GPU array, then all output variables are GPU array types, provided that they are supported for GPU code generation. For example, if the entry-point function returns a struct, the generated code returns a CPU output because struct is not supported. However, if the function returns a supported matrix type, the generated code returns a GPU output.
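
The output behavior described in the last item can be checked from MATLAB. The following is a minimal sketch, assuming a hypothetical MEX function test_mex generated from an entry-point function test.m that takes a 1-by-10 double GPU array input and returns a numeric matrix.

```matlab
% Assumes test_mex was generated with a GPU array input type from a
% hypothetical test.m that returns a supported numeric matrix.
in  = gpuArray(rand(1,10));   % input data already on the GPU
out = test_mex(in);           % call the generated CUDA MEX

% Per the consideration above, a supported matrix output is expected
% to come back as a GPU array.
onGpu = isa(out, 'gpuArray');

% Copy the result to host memory only when it is needed on the CPU.
result = gather(out);
```

Keeping the output on the GPU until gather is called avoids an extra device-to-host copy when the result feeds further GPU computation.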
