Generate code and build static library for Series or DAG Network
cnncodegen(net,'targetlib',libraryname) generates CUDA® C++ code and builds a static library for the specified network object and target library by using default values for all properties.
cnncodegen(net,'targetlib',libraryname,Name,Value) generates CUDA C++ code and builds a static library for the specified network object and target library with additional code generation options specified by one or more Name,Value pair arguments.
Use cnncodegen to generate C++ code for a pretrained network for deployment to an ARM® processor.
Get the pretrained GoogLeNet model by using the googlenet
(Deep Learning Toolbox) function. This function requires the Deep Learning Toolbox™ Model for GoogLeNet Network. If you have not installed this support package, the function
provides a download link. Alternatively, see https://www.mathworks.com/matlabcentral/fileexchange/64456-deep-learning-toolbox-model-for-googlenet-network.
net = googlenet;
Generate code by using cnncodegen
with
'targetlib'
set to 'arm-compute'
.
For 'arm-compute', you must specify the 'ArmArchitecture' field of the 'targetparams' structure.

cnncodegen(net,'targetlib','arm-compute', ...
    'targetparams',struct('ArmComputeVersion','19.02','ArmArchitecture','armv8'));
Generate CUDA C++ code from a SeriesNetwork
object created
for the YOLO architecture, trained for classifying the PASCAL dataset. This
example requires the GPU Coder™ product and GPU Coder Interface for Deep Learning Libraries.
Get the pretrained YOLO network and convert it into a
SeriesNetwork
object.
url = 'https://www.mathworks.com/supportfiles/gpucoder/cnn_models/Yolo/yolonet.mat';
websave('yolonet.mat',url);
net = coder.loadDeepLearningNetwork('yolonet.mat');
The SeriesNetwork object net contains 58 layers: convolution layers followed by leaky ReLU layers, with fully connected layers at the end of the network architecture. Use net.Layers to see all the layers in this network.
Use the cnncodegen
function to generate CUDA code.
cnncodegen(net,'targetlib','cudnn');
The code generator generates the .cu and header files in the '/pwd/codegen' folder. The series network is generated as a C++ class called CnnMain, containing an array of 58 layer classes. The setup() method of this class sets up handles and allocates resources for each layer object. The predict() method invokes prediction for each of the 58 layers in the network. The cleanup() method releases all the memory and system resources allocated for each layer object. All the binary weights (cnn_**_w) and bias files (cnn_**_b) for the convolution layers of the network are stored in the codegen folder. The files are compiled into the static library cnnbuild.a (on Linux®) or cnnbuild.lib (on Windows®).
net — Name of the series or DAG network object
Pretrained SeriesNetwork or DAGNetwork object.
libraryname — Deep learning target library
The target library and the target platform to generate code for, specified as one of the values in this table.
Value | Description |
---|---|
'arm-compute' | Target an ARM CPU processor by using the ARM Compute Library for computer vision and machine learning. Requires the MATLAB® Coder™ Interface for Deep Learning Libraries. |
'arm-compute-mali' | Target an ARM GPU processor by using the ARM Compute Library for computer vision and machine learning. Requires the GPU Coder product and the GPU Coder Interface for Deep Learning Libraries. |
'cudnn' | Target NVIDIA® GPUs by using the CUDA Deep Neural Network library (cuDNN). Requires the GPU Coder product and the GPU Coder Interface for Deep Learning Libraries. |
'mkldnn' | Target Intel® CPU processor by using the Intel Math Kernel Library for Deep Neural Networks (MKL-DNN). Requires the MATLAB Coder Interface for Deep Learning Libraries. |
'tensorrt' | Target NVIDIA GPUs by using NVIDIA TensorRT™, a high performance deep learning inference optimizer and run-time library. Requires the GPU Coder product and the GPU Coder Interface for Deep Learning Libraries. |
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
cnncodegen(net,'targetlib','mkldnn','codegenonly',0,'batchsize',1) generates C++ code for the Intel processor by using MKL-DNN and builds a static library for the network object in net.

'batchsize' — Batch size
A positive integer specifying the number of observations to operate on in a single call to the network predict() method. When calling network->predict(), the size of the input data must match the batchsize value specified during code generation.
If libraryname is 'arm-compute' or 'arm-compute-mali', the value of batchsize must be 1.
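As an illustration, the following sketch generates cuDNN code whose predict() method expects four observations per call. The network and batch size shown here are assumptions for the example, not requirements:

```matlab
% Generate cuDNN code with a fixed mini-batch size of 4. The input passed to
% the generated network->predict() must then contain exactly 4 observations.
net = googlenet;   % pretrained network (requires the GoogLeNet support package)
cnncodegen(net,'targetlib','cudnn','batchsize',4);
```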
'codegenonly' — Option to generate code only
Boolean flag that, when enabled, generates CUDA C++ code without generating and building a makefile.
'targetparams' — Library-specific parameters
Library-specific parameters specified as a 1-by-1 structure containing the fields described in these tables.
Parameters for ARM Compute Library (CPU)
Field | Description |
---|---|
ArmComputeVersion | Version of the ARM Compute Library on the target hardware, for example, '19.02'. |
ArmArchitecture | ARM architecture supported on the target hardware, for example, 'armv8'. |
Parameters for ARM Compute Library (Mali GPU)
Field | Description |
---|---|
ArmComputeVersion | Version of the ARM Compute Library on the target hardware, for example, '19.02'. |
Parameters for NVIDIA cuDNN Library
Field | Description |
---|---|
AutoTuning | Enable or disable the auto-tuning feature. Enabling auto tuning allows the cuDNN library to find the fastest convolution algorithms. This increases performance for larger networks such as SegNet and ResNet. Auto tuning is enabled by default. |
DataType | Precision of the tensor data type input to the network. When performing inference in 32-bit floats, use 'fp32'. |
CalibrationResultFile | Location of the MAT-file containing the calibration data. |
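As an illustration of passing these cuDNN parameters, the following sketch enables auto tuning through 'targetparams'; the field value shown is an assumption based on the table above:

```matlab
% Target cuDNN with auto tuning enabled so the library searches for the
% fastest convolution algorithms. The network object net is illustrative.
cnncodegen(net,'targetlib','cudnn', ...
    'targetparams',struct('AutoTuning',true));
```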
Parameters for NVIDIA TensorRT Library
Field | Description |
---|---|
DataType | Precision of the tensor data type input to the network or the tensor output of a layer. When performing inference in 32-bit floats, use 'fp32'. |
DataPath | Location of the image dataset used during recalibration. |
NumCalibrationBatches | Numeric value specifying the number of batches used for calibration. NVIDIA recommends that about 500 images are sufficient for calibrating. Refer to the TensorRT documentation for more information. |
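Similarly, a minimal sketch for passing TensorRT parameters through 'targetparams' might look like this; the 'DataType' value follows the table above, and the network object is an assumption:

```matlab
% Target TensorRT with 32-bit floating-point inference.
cnncodegen(net,'targetlib','tensorrt', ...
    'targetparams',struct('DataType','fp32'));
```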
'computecapability' — Compute version
This property affects GPU targeting only. Character vector or string scalar specifying the NVIDIA GPU compute capability to compile for. The argument takes the format major#.minor#. Possible values are '3.2'|'3.5'|'3.7'|'5.0'|'5.2'|'5.3'|'6.0'|'6.1'|'6.2'|'7.0'|'7.1'|'7.2'. Default value is '3.5'.
Behavior change in future release
In a future release, the cnncodegen function will generate C++ code and build a static library for only the ARM Mali GPU processor. You can continue to use the 'arm-compute-mali' value for the 'targetlib' argument to target an ARM Mali GPU by using the ARM Compute Library for computer vision and machine learning. For all other targets, use the codegen command.
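For reference, a minimal sketch of the codegen-based workflow for a cuDNN target might look like the following. The entry-point function name and input size are assumptions for illustration, not part of this function's interface:

```matlab
% Hypothetical entry-point function (saved as myPredict.m):
%   function out = myPredict(in)
%       persistent net;
%       if isempty(net)
%           net = coder.loadDeepLearningNetwork('googlenet');
%       end
%       out = predict(net,in);
%   end
cfg = coder.gpuConfig('lib');                           % static library build
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg myPredict -args {ones(224,224,3,'single')}
```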