Custom Deep Learning Processor Generation to Meet Performance Requirements

This example shows how to create a custom processor configuration and estimate the performance of a pretrained series network on it. You can then modify parameters of the custom processor configuration and re-estimate the performance. Once the estimate meets your performance requirements, you can generate a custom bitstream from the custom processor configuration.

Load Pretrained Series Network

To load the pretrained series network LogoNet, enter:

snet = getLogoNetwork;

Create Custom Processor Configuration

To create a custom processor configuration, use the dlhdl.ProcessorConfig object. For more information, see dlhdl.ProcessorConfig. To learn about modifiable parameters of the processor configuration, see getModuleProperty and setModuleProperty.

hPC = dlhdl.ProcessorConfig;
hPC.TargetFrequency = 220; % target frequency in MHz
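
Before changing any module parameters, it can help to read back their current values with getModuleProperty. The following is a minimal sketch, assuming Deep Learning HDL Toolbox is installed; hPCInspect is a hypothetical variable name used only here, and the property names are the ones that this example sets later with setModuleProperty.

```matlab
% Sketch: inspect the current module settings of a fresh processor configuration.
% hPCInspect is a hypothetical name so the example's hPC object is left untouched.
hPCInspect = dlhdl.ProcessorConfig;

% Read back the convolution module settings.
convType    = hPCInspect.getModuleProperty('conv', 'KernelDataType');
convThreads = hPCInspect.getModuleProperty('conv', 'ConvThreadNumber');

% Read back the fully connected module settings.
fcType    = hPCInspect.getModuleProperty('fc', 'KernelDataType');
fcThreads = hPCInspect.getModuleProperty('fc', 'FCThreadNumber');

fprintf('conv: %s, %d threads\n', convType, convThreads);
fprintf('fc:   %s, %d threads\n', fcType, fcThreads);
```

Recording these values gives a baseline to compare against after you modify the configuration.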

Create Workflow Object

Create a dlhdl.Workflow object. Specify snet as the network and hPC as the ProcessorConfig.

hW = dlhdl.Workflow('Network',snet,'ProcessorConfig',hPC)

Estimate LogoNet Performance

To estimate the performance of the LogoNet series network, use the estimate function of the dlhdl.Workflow object. The function returns the estimated layer latency, network latency, and network performance in frames per second (Frames/s).

hW.estimate('Performance')

The output of the estimate function is:

The estimated frames per second is 5.6 Frames/s. To improve the network performance, modify the custom processor convolution module kernel data type, convolution module thread number, fully connected module kernel data type, and fully connected module thread number. For more information about these processor parameters, see getModuleProperty and setModuleProperty.

Create Modified Custom Processor Configuration

To create the modified custom processor configuration, create a second dlhdl.ProcessorConfig object, hPCNew, and change its target frequency, kernel data types, and thread numbers. For more information, see dlhdl.ProcessorConfig. To learn about modifiable parameters of the processor configuration, see getModuleProperty and setModuleProperty.

hPCNew = dlhdl.ProcessorConfig;
hPCNew.TargetFrequency = 300;
hPCNew.setModuleProperty('conv', 'KernelDataType',   'int8');
hPCNew.setModuleProperty('conv', 'ConvThreadNumber', 64);
hPCNew.setModuleProperty('fc', 'KernelDataType',   'int8');
hPCNew.setModuleProperty('fc', 'FCThreadNumber',   16);

Quantize LogoNet Series Network

The new custom processor configuration uses the int8 kernel data type, so you must quantize the LogoNet network before you estimate its performance. For more information, see Estimate Performance of Quantized LogoNet Running On ZCU102 Bitstream. Use the resulting quantized network object, dlquantObj, to estimate performance with the new custom processor configuration.
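
The quantization step is not shown in this example. The following is a rough sketch of what it could look like, assuming Deep Learning HDL Toolbox and the Deep Learning Toolbox Model Quantization Library are installed. The folder name logos_dataset is a hypothetical location for labeled calibration images, and the images are assumed to already match the network input size; the referenced example describes the actual data preparation.

```matlab
% Sketch: quantize LogoNet for FPGA execution (assumptions are noted inline).
snet = getLogoNetwork;

% Assumption: 'logos_dataset' is a hypothetical folder with one subfolder per
% logo class, containing images that match the network input size.
imageData = imageDatastore('logos_dataset', ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

% Use half of the images from each class as calibration data.
calibrationData = splitEachLabel(imageData, 0.5, 'randomized');

% Create the quantizer for the FPGA execution environment and collect
% the dynamic ranges of the network by running the calibration data.
dlquantObj = dlquantizer(snet, 'ExecutionEnvironment', 'FPGA');
calibrate(dlquantObj, calibrationData);
```

The calibrated dlquantObj is the object that the next step passes to dlhdl.Workflow as the network.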

Create Workflow Object

Create a dlhdl.Workflow object. Specify dlquantObj as the network and hPCNew as the ProcessorConfig.

hW = dlhdl.Workflow('Network',dlquantObj,'ProcessorConfig',hPCNew)

Estimate LogoNet Performance

To estimate the performance of the quantized LogoNet series network on the modified processor configuration, use the estimate function of the dlhdl.Workflow object. The function returns the estimated layer latency, network latency, and network performance in frames per second (Frames/s).

hW.estimate('Performance')

The output of the estimate function is:

The estimated frames per second is 21.5 Frames/s.
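
To judge whether the modified configuration is good enough, compare the two estimates directly. The following sketch uses the two frame rates reported in this example; requiredFps is a hypothetical requirement chosen only for illustration.

```matlab
% Sketch: compare the two estimates reported in this example.
fpsBaseline = 5.6;   % estimate for the first configuration (Frames/s)
fpsModified = 21.5;  % estimate for the modified configuration (Frames/s)

% Relative speedup and per-frame latency in milliseconds.
speedup           = fpsModified / fpsBaseline;
latencyBaselineMs = 1000 / fpsBaseline;
latencyModifiedMs = 1000 / fpsModified;

% Hypothetical performance requirement, for illustration only.
requiredFps      = 20;
meetsRequirement = fpsModified >= requiredFps;

fprintf('Speedup: %.2fx\n', speedup);                           % prints 3.84x
fprintf('Latency: %.1f ms -> %.1f ms\n', ...
    latencyBaselineMs, latencyModifiedMs);                      % 178.6 ms -> 46.5 ms
fprintf('Meets %d Frames/s requirement: %d\n', requiredFps, meetsRequirement);
```

If the requirement is not met, adjust the processor configuration parameters again and re-run the estimate before generating a bitstream.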

Generate Custom Processor and Bitstream

Use the new custom processor configuration to build and generate a custom processor and bitstream. Use the custom bitstream to deploy the LogoNet network to your target FPGA board.

hdlsetuptoolpath('ToolName', 'Xilinx Vivado', 'ToolPath', 'C:\Xilinx\Vivado\2019.2\bin\vivado.bat');
dlhdl.buildProcessor(hPCNew);

To learn how to use the generated bitstream file, see Generate Custom Bitstream.

The generated bitstream in this example is similar to the zcu102_int8 bitstream. To deploy the quantized LogoNet network using the zcu102_int8 bitstream, see Obtain Prediction Results for Quantized LogoNet Network.