Class: dlhdl.Workflow
Package: dlhdl
Run inference on deployed network and profile speed of neural network deployed on specified target device
predict(imds) predicts responses for the image data in imds by using the deep learning network that you specified in the dlhdl.Workflow class for deployment on the specified target board and returns the results.
predict(imds, Name,Value) predicts responses for the image data in imds by using the deep learning network that you specified in the dlhdl.Workflow class for deployment on the specified target board and returns the results, with additional options specified by one or more name-value pair arguments.
Note
Before you run the predict function, make sure that your host computer is connected to the target device board. For more information, see Configure Board-Specific Setup Information.
Use this image to run the code:
% Save the pretrained SeriesNetwork object
snet = vgg19;

% Create a Target object and define the interface to the target board
hTarget = dlhdl.Target('Intel');

% Create a workflow object for the SeriesNetwork that uses the FPGA bitstream
hW = dlhdl.Workflow('Network', snet, 'Bitstream', 'arria10soc_single','Target',hTarget);

% Load the input image and resize it according to the network specifications
image = imread('zebra.jpeg');
inputImg = imresize(image, [224, 224]);
imshow(inputImg);
imIn = single(inputImg);

% Deploy the workflow object
hW.deploy;

% Predict the outcome and optionally profile the results to measure performance
[prediction, speed] = hW.predict(imIn,'Profile','on');
[val, idx] = max(prediction);
snet.Layers(end).ClassNames{idx}
### Finished writing input activations.
### Running single input activations.

              Deep Learning Processor Profiler Performance Results

                   LastLayerLatency(cycles)   LastLayerLatency(seconds)   FramesNum   Total Latency   Frames/s
                         -------------              -------------         ---------     ---------     ---------
Network                    166206640                   1.10804                1         166206873        0.9
    conv_module            156100737                   1.04067
        conv1_1              2174602                   0.01450
        conv1_2             15580687                   0.10387
        pool1                1976185                   0.01317
        conv2_1              7534356                   0.05023
        conv2_2             14623885                   0.09749
        pool2                1171628                   0.00781
        conv3_1              7540868                   0.05027
        conv3_2             14093791                   0.09396
        conv3_3             14093717                   0.09396
        conv3_4             14094381                   0.09396
        pool3                 766669                   0.00511
        conv4_1              6999620                   0.04666
        conv4_2             13725380                   0.09150
        conv4_3             13724671                   0.09150
        conv4_4             13725125                   0.09150
        pool4                 465360                   0.00310
        conv5_1              3424060                   0.02283
        conv5_2              3423759                   0.02283
        conv5_3              3424758                   0.02283
        conv5_4              3424461                   0.02283
        pool5                 113010                   0.00075
    fc_module               10105903                   0.06737
        fc6                  8397997                   0.05599
        fc7                  1370215                   0.00913
        fc8                   337689                   0.00225
 * The clock frequency of the DL processor is: 150MHz

ans =
    'zebra'
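The columns of the profiler table are related by simple arithmetic, which the following minimal sketch reproduces for the Network row. The numbers are copied from the output above. The variable names are illustrative, and the Frames/s relation (frames multiplied by clock frequency, divided by total latency in cycles) is an assumption that happens to match the printed values, not a documented formula.

```matlab
% Values copied from the Network row of the profiler output above.
clockHz            = 150e6;      % reported DL processor clock frequency (150 MHz)
lastLayerCycles    = 166206640;  % LastLayerLatency(cycles)
totalLatencyCycles = 166206873;  % Total Latency (cycles)
framesNum          = 1;          % FramesNum

% Cycles divided by clock frequency gives the latency in seconds.
latencySeconds = lastLayerCycles / clockHz;

% Assumed relation: frames processed per second of total latency.
framesPerSecond = framesNum * clockHz / totalLatencyCycles;

fprintf('LastLayerLatency: %.5f s\n', latencySeconds);
fprintf('Throughput: %.1f frames/s\n', framesPerSecond);
```

With these inputs the sketch gives about 1.10804 seconds and 0.9 frames per second, which agrees with the Network row of the table.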
Note
Before you run the predict function, make sure that your host computer is connected to the target device board. For more information, see Configure Board-Specific Setup Information.
Create a file in your current working directory called getLogoNetwork.m. Enter these lines into the file:
function net = getLogoNetwork()
    data = getLogoData();
    net  = data.convnet;
end

function data = getLogoData()
    % Download the pretrained network once and reuse the local copy afterward
    if ~isfile('LogoNet.mat')
        url = 'https://www.mathworks.com/supportfiles/gpucoder/cnn_models/logo_detection/LogoNet.mat';
        websave('LogoNet.mat',url);
    end
    data = load('LogoNet.mat');
end
Use this image to run the code:
To quantize the network, you need the products listed under FPGA in Quantization Workflow Prerequisites.
% Save the pretrained SeriesNetwork object
snet = getLogoNetwork();

% Create a Target object and define the interface to the target board
hTarget = dlhdl.Target('Xilinx','Interface','Ethernet');

% Create a quantized network object and calibrate it
dlquantObj = dlquantizer(snet,'ExecutionEnvironment','FPGA');
Image = imageDatastore('heineken.png','Labels','Heineken');
dlquantObj.calibrate(Image);

% Create a workflow object for the quantized network that uses the FPGA bitstream
hW = dlhdl.Workflow('Network', dlquantObj, 'Bitstream', 'zcu102_int8','Target',hTarget);

% Load the input image and resize it according to the network specifications
image = imread('heineken.png');
inputImg = imresize(image, [227, 227]);
imshow(inputImg);
imIn = single(inputImg);

% Deploy the workflow object
hW.deploy;

% Predict the outcome and optionally profile the results to measure performance
[prediction, speed] = hW.predict(imIn,'Profile','on');
[val, idx] = max(prediction);
snet.Layers(end).ClassNames{idx}
### Loading weights to FC Processor.
### FC Weights loaded. Current time is 12-Jun-2020 16:55:34
### FPGA bitstream programming has been skipped as the same bitstream is already loaded on the target FPGA.
### Deep learning network programming has been skipped as the same network is already loaded on the target FPGA.
### Finished writing input activations.
### Running single input activations.

              Deep Learning Processor Profiler Performance Results

                   LastLayerLatency(cycles)   LastLayerLatency(seconds)   FramesNum   Total Latency   Frames/s
                         -------------              -------------         ---------     ---------     ---------
Network                     13604105                   0.04535                1          13604146       22.1
    conv_module             12033763                   0.04011
        conv_1               3339984                   0.01113
        maxpool_1            1490805                   0.00497
        conv_2               2866483                   0.00955
        maxpool_2             574102                   0.00191
        conv_3               2432474                   0.00811
        maxpool_3             700552                   0.00234
        conv_4                617505                   0.00206
        maxpool_4              11951                   0.00004
    fc_module                1570342                   0.00523
        fc_1                  937715                   0.00313
        fc_2                  599341                   0.00200
        fc_3                   33284                   0.00011
 * The clock frequency of the DL processor is: 300MHz
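One way to read the profile is to compare how the network latency splits between the convolution and fully connected modules. The sketch below does this with the cycle counts copied from the output above. The variable names are illustrative, and it assumes that the module rows account for the whole Network row, which holds for these particular numbers.

```matlab
% Cycle counts copied from the quantized LogoNet profiler output above.
clockHz       = 300e6;     % reported DL processor clock frequency (300 MHz)
networkCycles = 13604105;  % Network row, LastLayerLatency(cycles)
convCycles    = 12033763;  % conv_module row
fcCycles      = 1570342;   % fc_module row

% For this output, the two module rows sum exactly to the Network row.
modulesSumToNetwork = (convCycles + fcCycles == networkCycles);

% Fraction of the network latency spent in each module.
convShare = convCycles / networkCycles;
fcShare   = fcCycles / networkCycles;

% Per-module latency in milliseconds.
convMs = 1e3 * convCycles / clockHz;
fcMs   = 1e3 * fcCycles / clockHz;

fprintf('conv_module: %.2f ms (%.1f%% of network latency)\n', convMs, 100*convShare);
fprintf('fc_module:   %.2f ms (%.1f%% of network latency)\n', fcMs, 100*fcShare);
```

For this run, the convolution module accounts for roughly 88% of the cycles, so it is the more promising place to look when trying to reduce latency.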
calibrate | compile | deploy | dlquantizationOptions | dlquantizer | estimate | validate