Load the training data for vehicle detection into the workspace.
data = load('vehicleTrainingData.mat');
trainingData = data.vehicleTrainingData;
Specify the directory in which the training samples are stored, and add the full path to the file names in the training data.
dataDir = fullfile(toolboxdir('vision'),'visiondata');
trainingData.imageFilename = fullfile(dataDir,trainingData.imageFilename);
Randomly shuffle the data for training.
rng(0);
shuffledIdx = randperm(height(trainingData));
trainingData = trainingData(shuffledIdx,:);
Create an imageDatastore using the files from the table.
imds = imageDatastore(trainingData.imageFilename);
Create a boxLabelDatastore using the label columns from the table.
blds = boxLabelDatastore(trainingData(:,2:end));
Combine the datastores.
ds = combine(imds, blds);
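Optionally, you can preview one sample from the combined datastore to confirm that images and labels are paired correctly. Each call to read returns a 1-by-3 cell array containing the image, the bounding boxes, and the labels. This preview step is a sketch and is not part of the original example.

```matlab
% Read one training sample from the combined datastore.
sample = read(ds);
I      = sample{1};   % training image
boxes  = sample{2};   % bounding boxes in [x y width height] format
labels = sample{3};   % categorical class labels

% Rewind the datastore so training starts from the first sample.
reset(ds)
```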
Load a preinitialized YOLO v2 object detection network.
net = load('yolov2VehicleDetector.mat');
lgraph = net.lgraph
lgraph = 
  LayerGraph with properties:

         Layers: [25×1 nnet.cnn.layer.Layer]
    Connections: [24×2 table]
     InputNames: {'input'}
    OutputNames: {'yolov2OutputLayer'}
Inspect the layers in the YOLO v2 network and their properties. You can also create the YOLO v2 network by following the steps given in Create YOLO v2 Object Detection Network.
lgraph.Layers
ans = 
  25x1 Layer array with layers:

     1   'input'               Image Input               128x128x3 images
     2   'conv_1'              Convolution               16 3x3 convolutions with stride [1 1] and padding [1 1 1 1]
     3   'BN1'                 Batch Normalization       Batch normalization
     4   'relu_1'              ReLU                      ReLU
     5   'maxpool1'            Max Pooling               2x2 max pooling with stride [2 2] and padding [0 0 0 0]
     6   'conv_2'              Convolution               32 3x3 convolutions with stride [1 1] and padding [1 1 1 1]
     7   'BN2'                 Batch Normalization       Batch normalization
     8   'relu_2'              ReLU                      ReLU
     9   'maxpool2'            Max Pooling               2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    10   'conv_3'              Convolution               64 3x3 convolutions with stride [1 1] and padding [1 1 1 1]
    11   'BN3'                 Batch Normalization       Batch normalization
    12   'relu_3'              ReLU                      ReLU
    13   'maxpool3'            Max Pooling               2x2 max pooling with stride [2 2] and padding [0 0 0 0]
    14   'conv_4'              Convolution               128 3x3 convolutions with stride [1 1] and padding [1 1 1 1]
    15   'BN4'                 Batch Normalization       Batch normalization
    16   'relu_4'              ReLU                      ReLU
    17   'yolov2Conv1'         Convolution               128 3x3 convolutions with stride [1 1] and padding 'same'
    18   'yolov2Batch1'        Batch Normalization       Batch normalization
    19   'yolov2Relu1'         ReLU                      ReLU
    20   'yolov2Conv2'         Convolution               128 3x3 convolutions with stride [1 1] and padding 'same'
    21   'yolov2Batch2'        Batch Normalization       Batch normalization
    22   'yolov2Relu2'         ReLU                      ReLU
    23   'yolov2ClassConv'     Convolution               24 1x1 convolutions with stride [1 1] and padding [0 0 0 0]
    24   'yolov2Transform'     YOLO v2 Transform Layer   YOLO v2 Transform Layer with 4 anchors
    25   'yolov2OutputLayer'   YOLO v2 Output            YOLO v2 Output with 4 anchors
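As an alternative to loading the preinitialized network, a similar YOLO v2 network can be assembled with the yolov2Layers function from a pretrained base network. The base network, feature layer name, and anchor box values below are illustrative assumptions and are not the values used to build the network in this example.

```matlab
% Sketch: assemble a YOLO v2 network with yolov2Layers.
% The base network (squeezenet), feature layer ('fire9-concat'), and
% anchor boxes are assumptions for illustration only.
imageSize   = [128 128 3];            % network input size
numClasses  = 1;                      % one class: vehicle
anchorBoxes = [1 1; 4 6; 5 3; 9 6];   % illustrative anchor boxes

baseNetwork  = squeezenet;
featureLayer = 'fire9-concat';

lgraph = yolov2Layers(imageSize,numClasses,anchorBoxes, ...
    baseNetwork,featureLayer);
```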
Configure the network training options.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.001, ...
    'Verbose',true, ...
    'MiniBatchSize',16, ...
    'MaxEpochs',30, ...
    'Shuffle','never', ...
    'VerboseFrequency',30, ...
    'CheckpointPath',tempdir);
Train the YOLO v2 network.
[detector,info] = trainYOLOv2ObjectDetector(ds,lgraph,options);
*************************************************************************
Training a YOLO v2 Object Detector for the following object classes:

* vehicle

Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |         7.13 |         50.8 |          0.0010 |
|       2 |          30 |       00:00:14 |         1.35 |          1.8 |          0.0010 |
|       4 |          60 |       00:00:27 |         1.13 |          1.3 |          0.0010 |
|       5 |          90 |       00:00:39 |         0.64 |          0.4 |          0.0010 |
|       7 |         120 |       00:00:51 |         0.65 |          0.4 |          0.0010 |
|       9 |         150 |       00:01:04 |         0.72 |          0.5 |          0.0010 |
|      10 |         180 |       00:01:16 |         0.52 |          0.3 |          0.0010 |
|      12 |         210 |       00:01:28 |         0.45 |          0.2 |          0.0010 |
|      14 |         240 |       00:01:41 |         0.61 |          0.4 |          0.0010 |
|      15 |         270 |       00:01:52 |         0.43 |          0.2 |          0.0010 |
|      17 |         300 |       00:02:05 |         0.42 |          0.2 |          0.0010 |
|      19 |         330 |       00:02:17 |         0.52 |          0.3 |          0.0010 |
|      20 |         360 |       00:02:29 |         0.43 |          0.2 |          0.0010 |
|      22 |         390 |       00:02:42 |         0.43 |          0.2 |          0.0010 |
|      24 |         420 |       00:02:54 |         0.59 |          0.4 |          0.0010 |
|      25 |         450 |       00:03:06 |         0.61 |          0.4 |          0.0010 |
|      27 |         480 |       00:03:18 |         0.65 |          0.4 |          0.0010 |
|      29 |         510 |       00:03:31 |         0.48 |          0.2 |          0.0010 |
|      30 |         540 |       00:03:42 |         0.34 |          0.1 |          0.0010 |
|========================================================================================|
Detector training complete.
*************************************************************************
Inspect the properties of the detector.
detector
detector = 
  yolov2ObjectDetector with properties:

            ModelName: 'vehicle'
              Network: [1×1 DAGNetwork]
    TrainingImageSize: [128 128]
          AnchorBoxes: [4×2 double]
           ClassNames: vehicle
You can check training convergence by plotting the training loss for each iteration.
figure
plot(info.TrainingLoss)
grid on
xlabel('Number of Iterations')
ylabel('Training Loss for Each Iteration')
Read a test image into the workspace.
img = imread('detectcars.png');
Run the trained YOLO v2 object detector on the test image for vehicle detection.
[bboxes,scores] = detect(detector,img);
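The detect function also accepts a 'Threshold' name-value argument that discards detections whose confidence scores fall below the threshold. The value 0.5 below is an illustrative choice, not a setting from this example.

```matlab
% Sketch: keep only detections with confidence scores above a threshold.
% The threshold value 0.5 is an assumption for illustration.
[bboxes,scores] = detect(detector,img,'Threshold',0.5);
```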
Display the detection results.
if ~isempty(bboxes)
    img = insertObjectAnnotation(img,'rectangle',bboxes,scores);
end
figure
imshow(img)