With MATLAB® Coder™, you can generate code for prediction from an already trained convolutional neural network (CNN), targeting an embedded platform that uses an ARM® processor that supports the NEON extension. The code generator takes advantage of the ARM Compute Library for computer vision and machine learning. The generated code implements a CNN that has the architecture, layers, and parameters specified in the input SeriesNetwork (Deep Learning Toolbox) or DAGNetwork (Deep Learning Toolbox) network object.
Code generation for deep learning with the ARM Compute Library requires:

MATLAB Coder Interface for Deep Learning Libraries. To install the support package, select it from the MATLAB Add-Ons menu.

ARM Compute Library for computer vision and machine learning, installed on the target hardware.

Deep Learning Toolbox™.

Environment variables for the compilers and libraries.

For supported versions of libraries and for information about setting up environment variables, see Prerequisites for Deep Learning with MATLAB Coder.
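On a Linux host, the environment setup typically amounts to exporting the library locations before you build. The variable name ARM_COMPUTELIB and the paths below are representative examples; check the prerequisites page for the exact variables and values your configuration needs.

```shell
# Point the build at the ARM Compute Library installation (path is an example)
export ARM_COMPUTELIB=/usr/local/arm_compute

# Make the shared libraries visible at run time
export LD_LIBRARY_PATH="$ARM_COMPUTELIB/lib:$LD_LIBRARY_PATH"
```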
Generate Code by Using codegen

To generate code for deep learning on an ARM target by using codegen:

Write an entry-point function that loads the pretrained CNN and calls predict. For example:
function out = squeezenet_predict(in)
%#codegen

persistent net;
opencv_linkflags = '`pkg-config --cflags --libs opencv`';
coder.updateBuildInfo('addLinkFlags',opencv_linkflags);
if isempty(net)
    net = coder.loadDeepLearningNetwork('squeezenet','squeezenet');
end

out = net.predict(in);
end
If your target hardware is Raspberry Pi™, you can take advantage of the MATLAB Support Package for Raspberry Pi Hardware. With the support package, codegen moves the generated code to the Raspberry Pi and builds the executable program on the Raspberry Pi. When you generate code for a target that does not have a hardware support package, you must run commands to move the generated files and build the executable program.
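With no support package, the move-and-build step might look like the following shell sketch. The board address, user name, target directory, and makefile name are placeholders (generated makefile names vary by project); adapt them to your setup.

```sh
# Copy the generated library sources to the board (host and paths are placeholders)
scp -r codegen/lib/squeezenet_predict user@arm-board:~/squeezenet_predict

# On the board: build from the generated sources (makefile name is an assumption)
ssh user@arm-board "cd ~/squeezenet_predict && make -f squeezenet_predict_rtw.mk"
```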
MEX generation is not supported for code generation for deep learning on ARM targets.
For ARM, for inputs to predict (Deep Learning Toolbox) with multiple images or observations (N > 1), a MiniBatchSize of greater than 1 is not supported. Specify a MiniBatchSize of 1.
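For instance, assuming a loaded network net and a batch of four images, a call that satisfies this restriction might look like the following sketch:

```matlab
% Batch of four 227-by-227 RGB images (N = 4)
in = ones(227,227,3,4,'single');

% On ARM targets, keep MiniBatchSize at 1 even when N > 1
out = net.predict(in,'MiniBatchSize',1);
```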
When you have the MATLAB Support Package for Raspberry Pi Hardware, to generate code for deep learning on a Raspberry Pi:
To connect to the Raspberry Pi, use raspi (MATLAB Support Package for Raspberry Pi Hardware). For example:
r = raspi('raspiname','username','password');
Create a code generation configuration object for a library or executable by using coder.config. Set the TargetLang property to 'C++'.
cfg = coder.config('exe');
cfg.TargetLang = 'C++';
Create a deep learning configuration object by using coder.DeepLearningConfig. Set the ArmComputeVersion and ArmArchitecture properties. Set the DeepLearningConfig property of the code generation configuration object to the coder.ARMNEONConfig object. For example:
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '19.05';
cfg.DeepLearningConfig = dlcfg;
To configure code generation hardware settings for the Raspberry Pi, create a coder.Hardware object by using coder.hardware. Set the Hardware property of the code generation configuration object to the coder.Hardware object.
hw = coder.hardware('Raspberry Pi');
cfg.Hardware = hw;
If you are generating an executable program, provide a C++ main program. For example:
cfg.CustomSource = 'main.cpp';
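A minimal main program for this example might look like the following sketch. The header name and the zero-filled input are placeholders, and this file will not compile on its own; the squeezenet_predict signature matches the generated squeezenet_predict.cpp shown later on this page.

```cpp
// main.cpp (sketch): driver for the generated entry point.
// "squeezenet_predict.h" is the expected generated header name (an
// assumption); adjust it to match your codegen output.
#include "squeezenet_predict.h"

int main()
{
  // 227 x 227 x 3 single-precision input (154587 elements).
  // Zero-filled as a placeholder; load a real image in practice.
  static real32_T in[154587] = {0.0F};
  static real32_T out[1000];

  squeezenet_predict(in, out);  // out holds the 1000 class scores
  return 0;
}
```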
To generate code, use codegen. Specify the code generation configuration object by using the -config option. For example:
codegen -config cfg squeezenet_predict -args {ones(227, 227, 3,'single')} -report
Note
You can specify half-precision inputs for code generation. However, the code generator casts the inputs to single precision. Deep Learning Toolbox uses single-precision, floating-point arithmetic for all computations in MATLAB.
To generate code for deep learning when you do not have a hardware support package for the target:
Generate code on a Linux® host only.
Create a configuration object for a library. For example:
cfg = coder.config('lib');
Do not use a configuration object for an executable program.
Configure code generation to generate C++ code and to generate source code only.
cfg.GenCodeOnly = true;
cfg.TargetLang = 'C++';
To specify code generation with the ARM Compute Library, create a coder.ARMNEONConfig object by using coder.DeepLearningConfig. Set the ArmComputeVersion and ArmArchitecture properties. Set the DeepLearningConfig property of the code generation configuration object to the coder.ARMNEONConfig object.
dlcfg = coder.DeepLearningConfig('arm-compute');
dlcfg.ArmArchitecture = 'armv7';
dlcfg.ArmComputeVersion = '19.05';
cfg.DeepLearningConfig = dlcfg;
To configure code generation parameters that are specific to the target hardware, set the ProdHWDeviceType property of the HardwareImplementation object.

For the ARMv7 architecture, use 'ARM Compatible->ARM Cortex'.

For the ARMv8 architecture, use 'ARM Compatible->ARM 64-bit (LP64)'.
For example:
cfg.HardwareImplementation.ProdHWDeviceType = 'ARM Compatible->ARM 64-bit (LP64)';
To generate code, use codegen. Specify the code generation configuration object by using the -config option. For example:
codegen -config cfg squeezenet_predict -args {ones(227, 227, 3, 'single')} -d arm_compute
For an example, see Code Generation for Deep Learning on ARM Targets.
The series network is generated as a C++ class containing an array of layer classes.
class b_squeezenet_0
{
  public:
    int32_T batchSize;
    int32_T numLayers;
    real32_T *inputData;
    real32_T *outputData;
    MWCNNLayer *layers[68];
  private:
    MWTargetNetworkImpl *targetImpl;
  public:
    b_squeezenet_0();
    void presetup();
    void postsetup();
    void setup();
    void predict();
    void cleanup();
    real32_T *getLayerOutput(int32_T layerIndex, int32_T portIndex);
    ~b_squeezenet_0();
};
The setup() method of the class sets up handles and allocates memory for each layer of the network object. The predict() method invokes prediction for each of the layers in the network. Suppose that you generate code for an entry-point function, squeezenet_predict. In the generated file squeezenet_predict.cpp, the entry-point function squeezenet_predict() constructs a static object of the b_squeezenet_0 class type and invokes setup and predict on the network object.
static b_squeezenet_0 net;
static boolean_T net_not_empty;

// Function Definitions
//
// A persistent object net is used to load the DAGNetwork object.
// At the first call to this function, the persistent object is constructed and
// set up. When the function is called subsequent times, the same object is reused
// to call predict on inputs, avoiding reconstructing and reloading the
// network object.
// Arguments    : const real32_T in[154587]
//                real32_T out[1000]
// Return Type  : void
//
void squeezenet_predict(const real32_T in[154587], real32_T out[1000])
{
  // Copyright 2018 The MathWorks, Inc.
  if (!net_not_empty) {
    DeepLearningNetwork_setup(&net);
    net_not_empty = true;
  }

  DeepLearningNetwork_predict(&net, in, out);
}
Binary files are exported for layers that have parameters, such as fully connected and convolution layers in the network. For example, the files with names having the pattern cnn_squeezenet_*_w and cnn_squeezenet_*_b correspond to weights and bias parameters for the convolution layers in the network.
cnn_squeezenet_conv10_b
cnn_squeezenet_conv10_w
cnn_squeezenet_conv1_b
cnn_squeezenet_conv1_w
cnn_squeezenet_fire2-expand1x1_b
cnn_squeezenet_fire2-expand1x1_w
cnn_squeezenet_fire2-expand3x3_b
cnn_squeezenet_fire2-expand3x3_w
cnn_squeezenet_fire2-squeeze1x1_b
cnn_squeezenet_fire2-squeeze1x1_w
...
Generate Code by Using the MATLAB Coder App

Complete the Select Source Files and Define Input Types steps.
Go to the Generate Code step. (Skip the Check for Run-Time Issues step because MEX generation is not supported for code generation with the ARM Compute Library.)
Set Language to C++.
Specify the target ARM hardware.
If your target hardware is Raspberry Pi and you installed the MATLAB Support Package for Raspberry Pi Hardware:
For Hardware Board, select Raspberry Pi.
To access the Raspberry Pi settings, click More Settings. Then, click Hardware. Specify the Device Address, Username, Password, and Build directory.
When you do not have a support package for your ARM target:
Make sure that Build type is Static Library or Dynamic Library, and select the Generate code only check box.
For Hardware Board, select None - Select device below.
For Device vendor, select ARM Compatible.
For the Device type:
For the ARMv7 architecture, select ARM Cortex.

For the ARMv8 architecture, select ARM 64-bit (LP64).
Note
If you generate code for deep learning on an ARM target, and do not use a hardware support package, generate code on a Linux host only.
In the Deep Learning pane, set Target library to ARM Compute. Specify the ARM Compute Library version and the ARM Compute Architecture.
Generate the code.
See Also

coder.ARMNEONConfig | coder.DeepLearningConfig | coder.loadDeepLearningNetwork