Configuration parameters for CUDA code generation from MATLAB code by using GPU Coder
The coder.gpuConfig object contains the configuration parameters that codegen uses for generating CUDA® MEX, a static library, a dynamically linked library, or an executable program with GPU Coder™. Pass the object to the codegen function by using the -config option.
cfg = coder.gpuConfig(build_type) creates a code generation configuration object for the specified build type, which can be CUDA MEX, a static library, a dynamically linked library, or an executable program. If the Embedded Coder® product is installed, it creates a coder.EmbeddedCodeConfig object for static library, dynamic library, or executable build types.
cfg = coder.gpuConfig(build_type,'ecoder',false) creates a code generation configuration object to generate CUDA 'lib', 'dll', or 'exe' output even if the Embedded Coder product is installed.
cfg = coder.gpuConfig(build_type,'ecoder',true) creates a coder.EmbeddedCodeConfig configuration object even if the Embedded Coder product is not installed. However, code generation using a coder.EmbeddedCodeConfig object requires an Embedded Coder license.
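As a rough sketch of the syntaxes above, the following calls create configuration objects for different build types. The variable names are illustrative only, and the class returned by the second call depends on whether Embedded Coder is installed.

```matlab
% CUDA MEX configuration
cfgMex = coder.gpuConfig('mex');

% Static library; a coder.EmbeddedCodeConfig object is created
% if the Embedded Coder product is installed
cfgLib = coder.gpuConfig('lib');

% Dynamically linked library, explicitly without an Embedded Coder
% configuration object
cfgDll = coder.gpuConfig('dll','ecoder',false);

% Inspect which configuration class was actually created
disp(class(cfgLib))
```

Checking the class of the returned object is a simple way to confirm which of the behaviors described above applies on a given installation.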
build_type — Output to build from generated CUDA C/C++ code
'MEX' | 'LIB' | 'DLL' | 'EXE'
Output to build from generated CUDA C/C++ code, specified as one of the values in this table.
Value | Description |
---|---|
'MEX' | CUDA MEX |
'LIB' | Static library |
'DLL' | Dynamically linked library |
'EXE' | Executable program |
coder.GpuConfig contains only the GPU-specific configuration parameters of the code configuration object. To see all the properties of the code configuration object, see coder.CodeConfig and coder.EmbeddedCodeConfig.
Enabled — Control GPU code generation
true (default) | false
Control generation of CUDA (*.cu) files by using one of the values in this table.
Value | Description |
---|---|
true | This value is the default value. Enables CUDA code generation. |
false | Disables CUDA code generation. |
Example: cfg.GpuConfig.Enabled = true
MallocMode — GPU memory allocation
'discrete' (default) | 'unified'
Memory allocation (malloc) mode to be used in the generated CUDA code, specified as one of the values in this table.
Value | Description |
---|---|
'discrete' | This value is the default value. The generated code uses the |
'unified' | The generated code uses the |
For more information, see Discrete and Managed Modes.
Example: cfg.GpuConfig.MallocMode = 'discrete'
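A minimal sketch of switching the allocation mode is shown below. Whether 'unified' is appropriate depends on the target hardware and release, which this page does not specify, so treat the choice here as an assumption for illustration.

```matlab
% Create a configuration object and select the unified memory mode
cfg = coder.gpuConfig('mex');
cfg.GpuConfig.MallocMode = 'unified';

% Confirm the setting before generating code
disp(cfg.GpuConfig.MallocMode)
```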
KernelNamePrefix — Custom kernel name prefixes
Specify a custom name prefix for all the kernels in the generated code. For example, using the value 'CUDA_' creates kernels with names CUDA_kernel1, CUDA_kernel2, and so on. If no name is provided, GPU Coder prepends the kernel name with the name of the entry-point function. Kernel names can contain uppercase letters, lowercase letters, digits 0–9, and the underscore character _. GPU Coder removes unsupported characters from the kernel names and appends alpha to prefixes that do not begin with an alphabetic letter.
Example: cfg.GpuConfig.KernelNamePrefix = 'myKernel'
EnableCUBLAS — Use cuBLAS library
true (default) | false
Replacement of math function calls with NVIDIA® cuBLAS library calls, specified as one of the values in this table.
Value | Description |
---|---|
true | This value is the default value. Allows GPU Coder to replace appropriate math function calls with calls to the cuBLAS library. |
false | Disables the use of the cuBLAS library. |
For more information, see Kernels from Library Calls.
Example: cfg.GpuConfig.EnableCUBLAS = true
EnableCUSOLVER — Use cuSOLVER library
true (default) | false
Replacement of math function calls with NVIDIA cuSOLVER library calls, specified as one of the values in this table.
Value | Description |
---|---|
true | This value is the default value. Allows GPU Coder to replace appropriate math function calls with calls to the cuSOLVER library. |
false | Disables the use of the cuSOLVER library. |
For more information, see Kernels from Library Calls.
Example: cfg.GpuConfig.EnableCUSOLVER = true
EnableCUFFT — Use cuFFT library
true (default) | false
Replacement of fft function calls with NVIDIA cuFFT library calls, specified as one of the values in this table.
Value | Description |
---|---|
true | This value is the default value. Allows GPU Coder to replace appropriate fft calls with calls to the cuFFT library. |
false | Disables use of the cuFFT library. |
For more information, see Kernels from Library Calls.
Example: cfg.GpuConfig.EnableCUFFT = true
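The three library options can be set independently. The sketch below assumes a hypothetical scenario in which the cuBLAS and cuSOLVER replacements are turned off while cuFFT replacement stays enabled; the build type is chosen only for illustration.

```matlab
cfg = coder.gpuConfig('lib');

% Turn off replacement of math function calls with cuBLAS and cuSOLVER calls
cfg.GpuConfig.EnableCUBLAS = false;
cfg.GpuConfig.EnableCUSOLVER = false;

% Keep replacement of fft calls with cuFFT calls (the default)
cfg.GpuConfig.EnableCUFFT = true;
```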
Benchmarking — Add benchmarking to the generated code
false (default) | true
Control addition of benchmarking code to the generated CUDA code by using one of the values in this table.
Value | Description |
---|---|
false | This value is the default value. The generated CUDA code does not contain benchmarking functionality. |
true | Generates CUDA code with benchmarking functionality. This option uses CUDA APIs such as |
Example: cfg.GpuConfig.Benchmarking = true
SafeBuild — Error checking in the generated code
false (default) | true
Add error-checking functionality to the generated CUDA code by using one of the values in this table.
Value | Description |
---|---|
false | This value is the default value. The generated CUDA code does not contain error-checking functionality. |
true | Generates code with error-checking for CUDA API and kernel calls. |
Example: cfg.GpuConfig.SafeBuild = true
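Benchmarking and SafeBuild are both off by default. One plausible use, sketched here as an assumption rather than a recommended workflow, is to enable both while debugging and profiling, then restore the defaults for a final build.

```matlab
cfg = coder.gpuConfig('mex');

% Add error checking for CUDA API and kernel calls
cfg.GpuConfig.SafeBuild = true;

% Add benchmarking functionality to the generated code
cfg.GpuConfig.Benchmarking = true;

% ... generate and test code here, then restore the defaults
cfg.GpuConfig.SafeBuild = false;
cfg.GpuConfig.Benchmarking = false;
```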
ComputeCapability — Minimum compute capability for code generation
'3.5' (default) | '3.2' | '3.7' | '5.0' | '5.2' | '5.3' | '6.0' | '6.1' | '6.2' | '7.0' | '7.1' | '7.2'
Select the minimum compute capability for code generation. The compute capability identifies the features supported by the GPU hardware. Applications use it at run time to determine which hardware features and instructions are available on the present GPU. If you specify a custom compute capability by using CustomComputeCapability, GPU Coder ignores this setting.
Example: cfg.GpuConfig.ComputeCapability = '6.1'
CustomComputeCapability — Control GPU code generation
'' (default) | character vector
Specify the name of the NVIDIA virtual GPU architecture for which the CUDA input files must be compiled. For example, to specify a virtual architecture type, use -arch=compute_50. You can specify a real architecture by using -arch=sm_50. For more information, see the Options for Steering GPU Code Generation topic in the CUDA toolkit documentation.
Example: cfg.GpuConfig.CustomComputeCapability = '-arch=compute_50'
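The interaction between the two compute capability properties can be sketched as follows. The values are taken from the examples on this page; which one suits a particular GPU is not something this sketch can determine.

```matlab
cfg = coder.gpuConfig('mex');

% Select one of the listed minimum compute capabilities
cfg.GpuConfig.ComputeCapability = '6.1';

% Setting a custom compute capability takes precedence:
% GPU Coder then ignores the ComputeCapability value above
cfg.GpuConfig.CustomComputeCapability = '-arch=compute_50';
```

Leaving CustomComputeCapability at its default empty value keeps the ComputeCapability setting in effect.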
CompilerFlags — Additional flags to the GPU compiler
'' (default) | character vector
Pass additional flags to the GPU compiler. For example, --fmad=false instructs the nvcc compiler to disable contraction of floating-point multiply and add to a single Floating-Point Multiply-Add (FMAD) instruction.
For similar NVIDIA compiler options, see the topic on NVCC Command Options in the CUDA toolkit documentation.
Example: cfg.GpuConfig.CompilerFlags = '--fmad=false'
StackLimitPerThread — Stack limit per GPU thread
1024 (default) | integer
Specify the maximum stack limit per GPU thread as an integer value.
Example: cfg.GpuConfig.StackLimitPerThread = 1024
MallocThreshold — Malloc threshold
200 (default) | integer
Specify the size above which the private variables are allocated on the heap instead of the stack, as an integer value.
Example: cfg.GpuConfig.MallocThreshold = 256
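The stack limit and the malloc threshold both affect where the generated code places data, so they are often adjusted together. The values below are hypothetical and chosen only to show the syntax.

```matlab
cfg = coder.gpuConfig('mex');

% Raise the per-thread stack limit above the default of 1024
cfg.GpuConfig.StackLimitPerThread = 2048;

% Private variables larger than this size are allocated on the heap
% instead of the stack (default is 200)
cfg.GpuConfig.MallocThreshold = 256;
```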
SelectCudaDevice — CUDA device selection
-1 (default) | deviceID
In a multi-GPU environment such as NVIDIA Drive platforms, specify the CUDA device to target.
Example: cfg.GpuConfig.SelectCudaDevice = <DeviceID>
Generate a CUDA MEX function from a MATLAB function that is suitable for GPU code generation. Also, enable a code generation report.
Write a MATLAB function, VecAdd, that performs vector addition of inputs A and B.
    function [C] = VecAdd(A,B) %#codegen
        C = coder.nullcopy(zeros(size(A)));
        coder.gpu.kernelfun();
        C = A + B;
    end
To generate a MEX function, create a code generation configuration object.
cfg = coder.gpuConfig('mex');
Enable the code generation report.
    cfg.GpuConfig.EnableCUBLAS = true;
    cfg.GenerateReport = true;
Generate a MEX function in the current folder, specifying the configuration object by using the -config option.
    % Generate a MEX function and code generation report
    codegen -config cfg -args {zeros(512,512,'double'),zeros(512,512,'double')} VecAdd
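After code generation succeeds, the result can be checked against the original MATLAB function. This sketch assumes that codegen used its default output name, VecAdd_mex, and that a supported GPU is available; the test inputs are arbitrary.

```matlab
% Random test inputs matching the 512-by-512 double types used for -args
A = rand(512,512);
B = rand(512,512);

% Run the generated CUDA MEX function and the original MATLAB function
C_mex = VecAdd_mex(A,B);
C_ref = VecAdd(A,B);

% Report the largest element-wise difference between the two results
maxErr = max(abs(C_mex(:) - C_ref(:)));
fprintf('Maximum absolute difference: %g\n', maxErr);
```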
GPU Coder always sets the PassStructByReference property of the code configuration object to true.
codegen | coder.CodeConfig | coder.EmbeddedCodeConfig | coder.MexCodeConfig