Logo Recognition Network

This example shows code generation for a logo classification application that uses deep learning. It uses the codegen command to generate a MEX function that performs prediction on a SeriesNetwork object called LogoNet.

Third-Party Prerequisites

Required

This example generates CUDA MEX and has the following third-party requirements.

  • CUDA® enabled NVIDIA® GPU and compatible driver.

Optional

For non-MEX builds such as static, dynamic libraries or executables, this example has the following additional requirements.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

The Logo Recognition Network

Logos assist users in brand identification and recognition. Many companies incorporate their logos in advertising, documentation materials, and promotions. The logo recognition network (logonet) was developed in MATLAB® and can recognize 32 logos under various lighting conditions and camera motions. Because this network focuses only on recognition, you can use it in applications where localization is not required.

Training the Network

The network is trained in MATLAB by using training data that contains around 200 images for each logo. Because the number of images for training the network is small, data augmentation increases the number of training samples. Four types of data augmentation are used: contrast normalization, Gaussian blur, random flipping, and shearing. This data augmentation helps in recognizing logos in images captured by different lighting conditions and camera motions. The input size for logonet is [227 227 3]. Standard SGDM trains by using a learning rate of 0.0001 for 40 epochs with a mini-batch size of 45. The trainLogonet.m helper script demonstrates the data augmentation on a sample image, architecture of the logonet, and training options.

Get Pretrained SeriesNetwork

Download the logonet network and save it to LogoNet.mat.

getLogonet();

The saved network contains 22 layers including convolution, fully connected, and the classification output layers.

load('LogoNet.mat');
convnet.Layers
ans = 

  22×1 Layer array with layers:

     1   'imageinput'    Image Input             227×227×3 images with 'zerocenter' normalization and 'randfliplr' augmentations
     2   'conv_1'        Convolution             96 5×5×3 convolutions with stride [1  1] and padding [0  0  0  0]
     3   'relu_1'        ReLU                    ReLU
     4   'maxpool_1'     Max Pooling             3×3 max pooling with stride [2  2] and padding [0  0  0  0]
     5   'conv_2'        Convolution             128 3×3×96 convolutions with stride [1  1] and padding [0  0  0  0]
     6   'relu_2'        ReLU                    ReLU
     7   'maxpool_2'     Max Pooling             3×3 max pooling with stride [2  2] and padding [0  0  0  0]
     8   'conv_3'        Convolution             384 3×3×128 convolutions with stride [1  1] and padding [0  0  0  0]
     9   'relu_3'        ReLU                    ReLU
    10   'maxpool_3'     Max Pooling             3×3 max pooling with stride [2  2] and padding [0  0  0  0]
    11   'conv_4'        Convolution             128 3×3×384 convolutions with stride [2  2] and padding [0  0  0  0]
    12   'relu_4'        ReLU                    ReLU
    13   'maxpool_4'     Max Pooling             3×3 max pooling with stride [2  2] and padding [0  0  0  0]
    14   'fc_1'          Fully Connected         2048 fully connected layer
    15   'relu_5'        ReLU                    ReLU
    16   'dropout_1'     Dropout                 50% dropout
    17   'fc_2'          Fully Connected         2048 fully connected layer
    18   'relu_6'        ReLU                    ReLU
    19   'dropout_2'     Dropout                 50% dropout
    20   'fc_3'          Fully Connected         32 fully connected layer
    21   'softmax'       Softmax                 softmax
    22   'classoutput'   Classification Output   crossentropyex with 'adidas' and 31 other classes

The logonet_predict Entry-Point Function

The logonet_predict.m entry-point function takes an image input and performs prediction on the image using the deep learning network saved in the LogoNet.mat file. The function loads the network object from LogoNet.mat into a persistent variable logonet and reuses the persistent variable on subsequent prediction calls.

type('logonet_predict.m')
function out = logonet_predict(in)
%#codegen

% Copyright 2017-2019 The MathWorks, Inc.

persistent logonet;

if isempty(logonet)
    
    logonet = coder.loadDeepLearningNetwork('LogoNet.mat','logonet');
end

out = logonet.predict(in);

end

Generate CUDA MEX for the logonet_predict Function

Create a GPU configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. To generate CUDA MEX, use the codegen command and specify the input to be of size [227,227,3]. This value corresponds to the input layer size of the logonet network.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg logonet_predict -args {ones(227,227,3,'uint8')} -report
Code generation successful: To view the report, open('codegen/mex/logonet_predict/html/report.mldatx').

Run Generated MEX

Load an input image. Call logonet_predict_mex on the input image.

im = imread('test.png');
imshow(im);
im = imresize(im, [227,227]);
predict_scores = logonet_predict_mex(im);

Map the top five prediction scores to words in the Wordnet dictionary synset (logos).

synsetOut = {'adidas', 'aldi', 'apple', 'becks', 'bmw', 'carlsberg', ...
    'chimay', 'cocacola', 'corona', 'dhl', 'erdinger', 'esso', 'fedex',...
    'ferrari', 'ford', 'fosters', 'google', 'guinness', 'heineken', 'hp',...
    'milka', 'nvidia', 'paulaner', 'pepsi', 'rittersport', 'shell', 'singha', 'starbucks', 'stellaartois', 'texaco', 'tsingtao', 'ups'};

[val,indx] = sort(predict_scores, 'descend');
scores = val(1:5)*100;
top5labels = synsetOut(indx(1:5));

Display the top five classification labels.

outputImage = zeros(227,400,3, 'uint8');
for k = 1:3
    outputImage(:,174:end,k) = im(:,:,k);
end

scol = 1;
srow = 20;

for k = 1:5
    outputImage = insertText(outputImage, [scol, srow], [top5labels{k},' ',num2str(scores(k), '%2.2f'),'%'], 'TextColor', 'w','FontSize',15, 'BoxColor', 'black');
    srow = srow + 20;
end

 imshow(outputImage);

Clear the static network object that was loaded in memory.

clear mex;

See Also

Functions

Objects

Related Topics