Performance

Troubleshoot code generation issues, improve code execution time, and reduce memory usage of generated code

Some of the most common reasons why GPU Coder™ generated code is not performing as expected are:

  • CUDA® kernels are not created.

  • Host-to-device and device-to-host memory transfers (cudaMemcpy) are throttling performance.

  • Not enough parallelism or device issues.

These topics elaborate on the common causes of these symptoms and describe how to use the built-in screener to detect these issues. You can also find information on how to work around these issues and generate more efficient CUDA code.
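
As a minimal sketch of how to check for the first symptom (CUDA kernels not being created), the following MATLAB code marks a function for kernel creation and generates a CUDA MEX function together with a code generation report. The function name scaleAdd, its inputs, and the array size are hypothetical and chosen only for illustration. The sketch assumes a CUDA-capable GPU and a working GPU Coder environment.

```matlab
% ---- File: scaleAdd.m (hypothetical entry-point function) ----
function y = scaleAdd(x, a) %#codegen
% Ask GPU Coder to map the computation in this function to CUDA kernels.
coder.gpu.kernelfun;
y = zeros(size(x), 'like', x);
for i = 1:numel(x)
    y(i) = a * x(i) + 1;   % independent iterations, suitable for a kernel
end
end

% ---- Script: run from the folder that contains scaleAdd.m ----
% Example inputs that define the size and type of the entry-point arguments
% (assumed values, not taken from the documentation).
x = ones(1, 4096, 'single');
a = single(2);

% Configuration object for CUDA MEX generation.
cfg = coder.gpuConfig('mex');

% Generate code and a report for inspecting the generated CUDA code.
codegen -config cfg -args {x,a} -report scaleAdd
```

The generated report can then be used to confirm whether a kernel was created for the loop and where cudaMemcpy calls appear in the generated code.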

Apps

GPU Coder          Generate GPU code from MATLAB code
Check GPU Install  Verify and set up the GPU code generation environment

Functions

codegen             Generate C/C++ code from MATLAB code
gpucoder            Open GPU Coder app
coder.gpu.kernel    Pragma that maps for-loops to GPU kernels
coder.gpu.kernelfun Pragma that maps function to GPU kernels
coder.gpu.nokernel  Pragma to disable kernel creation for loops
gpucoder.profile    Create an execution profile report for the generated CUDA code

Objects

coder.gpuConfig           Configuration parameters for CUDA code generation from MATLAB code by using GPU Coder
coder.CodeConfig          Configuration parameters for C/C++ code generation from MATLAB code
coder.EmbeddedCodeConfig  Configuration parameters for C/C++ code generation from MATLAB code with Embedded Coder
coder.gpuEnvConfig        Create configuration object containing the parameters passed to coder.checkGpuInstall for performing GPU code generation environment checks

Topics

Workflow

GPU Coder troubleshooting workflow.

Code Generation Reports

Create and view reports generated during code generation.

Trace Between Generated CUDA Code and MATLAB Source Code

Highlight sections of MATLAB code that run on the GPU.

Generating a GPU Code Metrics Report for Code Generated from MATLAB Code

Create and explore a GPU static code metrics report.

Kernel Analysis

Recommendations for generating efficient CUDA kernels.

Memory Bottleneck Analysis

Reduce memory bottleneck issues when using GPU Coder.

Analyze Execution Profiles of the Generated Code

Fine-grained profiling of the MATLAB algorithm and its generated CUDA code through software-in-the-loop (SIL) execution.

Analysis with NVIDIA Profiler

Improve performance by using the information obtained from NVIDIA Profiler (nvvp).

GPU Coder Limitations

See current limitations of GPU Coder.

Featured Examples