Cross-Compile Deep Learning Code That Uses ARM Compute Library

On the computer that hosts your MATLAB® session, you can generate deep learning source code and compile it to create a library or an executable that runs on a target ARM® hardware device. The compilation of source code on one platform to create binary code for another platform is known as cross-compilation. This workflow is supported only for the Linux® host platform and target devices that have armv7 (32-bit) or armv8 (64-bit) ARM architecture.

Use this workflow to deploy deep learning code on ARM devices that do not have hardware support packages.

Prerequisites

These are the prerequisites specific to the cross-compilation workflow. For the general prerequisites, see Prerequisites for Deep Learning with MATLAB Coder.

The target device must have armv7 (32-bit) or armv8 (64-bit) ARM architecture. To verify the architecture of your device, run this command in the terminal of the device:
```
arch
```
You must have the Linaro AArch32 or AArch64 toolchain installed on the host computer.
- For armv7 target, install the GNU/GCC g++-arm-linux-gnueabihf toolchain on the host.
- For armv8 target, install the GNU/GCC g++-aarch64-linux-gnu toolchain on the host.
For example, to install the Linaro AArch64 toolchain on the host, run this command in the terminal:
```
sudo apt-get install g++-aarch64-linux-gnu
```
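The mapping from target architecture to toolchain package can also be scripted. The following is a minimal sketch and not part of the documented workflow. The function name `toolchain_package` is hypothetical, and the sketch assumes that `arch` on the target reports `armv7l` for a 32-bit device and `aarch64` for a 64-bit device, which is typical on Linux but worth confirming on your own device.

```shell
#!/bin/sh
# Hypothetical helper: map the string printed by `arch` on the target
# to the host toolchain package named in this section.
# Assumption: the target reports armv7* (32-bit) or aarch64/armv8* (64-bit).
toolchain_package() {
    case "$1" in
        armv7*)          echo "g++-arm-linux-gnueabihf" ;;
        aarch64|armv8*)  echo "g++-aarch64-linux-gnu" ;;
        *)               echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Example: print the package name for a 64-bit target.
toolchain_package aarch64
```

You could pass the printed package name to `sudo apt-get install` on the host.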
At the MATLAB command line, set the environment variable LINARO_TOOLCHAIN_AARCH32 or LINARO_TOOLCHAIN_AARCH64 to the path of the toolchain binaries. You must set the path once per MATLAB session.
Suppose that the toolchain is installed at the location /usr/bin on the host.
- For armv7 target, run this command:
  setenv('LINARO_TOOLCHAIN_AARCH32', '/usr/bin')
- For armv8 target, run this command:
  setenv('LINARO_TOOLCHAIN_AARCH64', '/usr/bin')
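Before setting either variable, it can help to confirm in the host terminal that the directory actually contains the cross-compiler. This is a minimal sketch, assuming that the compiler binaries follow the usual Debian/Ubuntu naming (`arm-linux-gnueabihf-g++` and `aarch64-linux-gnu-g++`). The function name `has_cross_compiler` is hypothetical.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the directory holds an executable
# cross-compiler for the given target (armv7 or armv8).
# Assumption: binaries use the usual Debian/Ubuntu cross-compiler names.
has_cross_compiler() {
    dir=$1
    case "$2" in
        armv7) compiler="arm-linux-gnueabihf-g++" ;;
        armv8) compiler="aarch64-linux-gnu-g++" ;;
        *)     echo "unknown target: $2" >&2; return 2 ;;
    esac
    [ -x "$dir/$compiler" ]
}

# Example: check the location assumed in this section.
if has_cross_compiler /usr/bin armv8; then
    echo "armv8 cross-compiler found in /usr/bin"
else
    echo "armv8 cross-compiler not found in /usr/bin"
fi
```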
Cross-compile the ARM Compute library on the host:
- Clone the Git™ repository for ARM Compute library and check out the version you need. For example, to check out v19.05, run these commands in the host terminal:
```
git clone https://github.com/Arm-software/ComputeLibrary.git
cd ComputeLibrary
git tag -l
git checkout v19.05
```
- Install scons on the host. For example, run this command in the host terminal:
```
sudo apt-get install scons
```
- Use scons to cross-compile the ARM Compute library on the host. For example, to build the library for the armv8 architecture, run this command in the host terminal:
```
scons Werror=0 -j8 debug=0 neon=1 opencl=0 os=linux arch=arm64-v8a openmp=1 cppthreads=1 examples=0 asserts=0 build=cross_compile
```
- At the MATLAB command line, set the environment variable ARM_COMPUTELIB to the path of the ARM Compute library. You must set the path once per MATLAB session.
  Suppose that the ARM Compute library is installed at the location /home/$(USER)/Desktop/ComputeLibrary. Run this command at the MATLAB command line:
  setenv('ARM_COMPUTELIB','/home/$(USER)/Desktop/ComputeLibrary')
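
The scons command above targets armv8. The following minimal sketch parameterizes it by target. It assumes that `arch=armv7a` is the matching ARM Compute library build option for a 32-bit target. Confirm this against the build documentation for the version you checked out. The function name `acl_scons_command` is hypothetical, and the function only prints the command so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the scons command line for a given target.
acl_scons_command() {
    case "$1" in
        armv7) acl_arch="armv7a" ;;      # assumption: verify for your library version
        armv8) acl_arch="arm64-v8a" ;;   # value used in this section
        *)     echo "unknown target: $1" >&2; return 1 ;;
    esac
    echo "scons Werror=0 -j8 debug=0 neon=1 opencl=0 os=linux arch=$acl_arch openmp=1 cppthreads=1 examples=0 asserts=0 build=cross_compile"
}

# Example: print the armv8 build command.
acl_scons_command armv8
```

Run the printed command from inside the cloned ComputeLibrary folder.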

Generate and Deploy Deep Learning Code

There are two possible workflows for cross-compiling deep learning code on your host computer and then deploying the code on target ARM hardware. Here is a summary of the two workflows. For an example that demonstrates both workflows, see Cross Compile Deep Learning Code for ARM Neon Targets.

In the first workflow, you generate a static or dynamic library for the deep learning code on the host computer. Follow these steps:
- On the host, use the codegen command to generate and build deep learning code to create a static or dynamic library.
- Copy the generated library, the ARM Compute library files, the makefile, and other supporting files to the target hardware.
- Use the copied makefile on the target to build an executable.
- Run the generated executable on the target hardware.
In the second workflow, you generate an executable for the deep learning code on the host computer. Follow these steps:
- On the host, use the codegen command to generate and build deep learning code to create an executable.
- Copy the generated executable, the ARM Compute library files, and other supporting files to the target hardware.
- Run the executable on the target hardware.
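
The copy and run steps of the executable workflow can be sketched with `scp` and `ssh`. Everything specific here is an assumption for illustration: the login `user@target-device`, the remote folder `/home/user/deploy`, the function name `deploy_commands`, and the use of `LD_LIBRARY_PATH` to locate the ARM Compute shared libraries at run time. The function prints the commands rather than running them, so you can review and adapt them first.

```shell
#!/bin/sh
# Hypothetical values: replace with your own target login and folder.
TARGET="user@target-device"
REMOTE_DIR="/home/user/deploy"

# Print (do not run) the commands that copy the executable and the
# ARM Compute library folder to the target and then start the executable.
# $1: generated executable on the host
# $2: host folder that holds the cross-compiled ARM Compute libraries
deploy_commands() {
    exe=$1
    acl_lib=$2
    echo "scp $exe $TARGET:$REMOTE_DIR/"
    echo "scp -r $acl_lib $TARGET:$REMOTE_DIR/"
    # Assumption: the executable loads the libraries dynamically.
    echo "ssh $TARGET 'cd $REMOTE_DIR && LD_LIBRARY_PATH=$REMOTE_DIR/$(basename "$acl_lib") ./$(basename "$exe")'"
}

# Example with placeholder paths.
deploy_commands ./dl_app /path/to/acl/libs
```

Any other supporting files that your application needs, such as input data, would be copied in the same way.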
