Some of the most common reasons why GPU Coder™ generated code is not performing as expected are:
CUDA® kernels are not created.
Host-to-device and device-to-host memory transfers (cudaMemcpy) are throttling performance.
Not enough parallelism or device issues.
These topics elaborate on the common causes of these symptoms and describe how to use the built-in screener to detect them. They also explain how to work around these issues and generate more efficient CUDA code.
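As a rough illustration of checking the first symptom (kernels not being created), the sketch below generates CUDA MEX code with a report for a small loop. The function name saxpy_example, its file name, and the input sizes are hypothetical and chosen only for illustration; the build commands are shown as comments because they run at the MATLAB command line rather than inside the function file.

```matlab
% ---- File: saxpy_example.m (hypothetical entry-point function) ----
function y = saxpy_example(a, x, y) %#codegen
% Ask GPU Coder to map the computation in this function to CUDA kernels.
coder.gpu.kernelfun;
for i = 1:numel(x)
    y(i) = a * x(i) + y(i);   % independent iterations, a candidate for a kernel
end
end

% ---- Run at the MATLAB command line (not part of the file above) ----
% cfg = coder.gpuConfig('mex');        % configuration for a CUDA MEX build
% x = ones(1, 4096, 'single');         % example input used to define argument types
% codegen -config cfg -args {single(2), x, x} -report saxpy_example
```

In the resulting report, generated kernels appear as `__global__` functions in the .cu files. If none are present for a loop you expected to run on the GPU, that loop was not mapped to a kernel.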
GPU Coder | Generate GPU code from MATLAB code |
Check GPU Install | Verify and set up the GPU code generation environment |
codegen | Generate C/C++ code from MATLAB code |
gpucoder | Open GPU Coder app |
coder.gpu.kernel | Pragma that maps for-loops to GPU kernels |
coder.gpu.kernelfun | Pragma that maps function to GPU kernels |
coder.gpu.nokernel | Pragma to disable kernel creation for loops |
gpucoder.profile | Create an execution profile report for the generated CUDA code |
coder.gpuConfig | Configuration parameters for CUDA code generation from MATLAB code by using GPU Coder |
coder.CodeConfig | Configuration parameters for C/C++ code generation from MATLAB code |
coder.EmbeddedCodeConfig | Configuration parameters for C/C++ code generation from MATLAB code with Embedded Coder |
coder.gpuEnvConfig | Create configuration object containing the parameters passed to coder.checkGpuInstall for performing GPU code generation environment checks |
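Before investigating performance, it can help to rule out device or setup problems. The following is a minimal sketch, assuming a host (desktop) GPU target, that combines coder.gpuEnvConfig and coder.checkGpuInstall from the list above. The particular property settings are illustrative choices, not required values.

```matlab
% Environment check for a host GPU; requires GPU Coder and a CUDA-capable device.
envCfg = coder.gpuEnvConfig('host');   % configuration object for the checks
envCfg.BasicCodegen = 1;               % also test basic CUDA code generation
envCfg.Quiet = 1;                      % suppress command-line output

% Run the checks and capture the outcome as a struct instead of printed text.
results = coder.checkGpuInstall(envCfg);
disp(results)
```

Capturing the results in a variable makes it possible to inspect individual checks programmatically, for example in a setup script that runs before code generation.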
GPU Coder troubleshooting workflow.
Create and view reports generated during code generation.
Trace Between Generated CUDA Code and MATLAB Source Code
Highlight sections of MATLAB code that run on the GPU.
Generating a GPU Code Metrics Report for Code Generated from MATLAB Code
Create and explore GPU static code metrics report.
Recommendations for generating efficient CUDA kernels.
Reduce memory bottleneck issues when using GPU Coder.
Analyze Execution Profiles of the Generated Code
Fine-grain profiling for the MATLAB algorithm and its generated CUDA code through SIL.
Improve performance by using the information obtained from NVIDIA Profiler (nvvp).
See current limitations of GPU Coder.