trainCascadeObjectDetector

Train cascade object detector model

Description

example

trainCascadeObjectDetector(outputXMLFilename,positiveInstances,negativeImages) writes a trained cascade detector XML file named, outputXMLFilename. The file name must include an XML extension. For a more detailed explanation on how this function works, refer to Train a Cascade Object Detector.

trainCascadeObjectDetector(outputXMLFilename,'resume') resumes an interrupted training session. The outputXMLFilename input must match the output file name from the interrupted session. All arguments saved from the earlier session are reused automatically.

example

trainCascadeObjectDetector(___,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

Examples

collapse all

Load the positive samples data from a MAT file. The file contains a table specifying bounding boxes for several object categories. The table was exported from the Training Image Labeler app.

Load positive samples.

load('stopSignsAndCars.mat');

Select the bounding boxes for stop signs from the table.

positiveInstances = stopSignsAndCars(:,1:2);

Add the image folder to the MATLAB path.

imDir = fullfile(matlabroot,'toolbox','vision','visiondata',...
    'stopSignImages');
addpath(imDir);

Specify the folder for negative images.

negativeFolder = fullfile(matlabroot,'toolbox','vision','visiondata',...
    'nonStopSigns');

Create an imageDatastore object containing negative images.

negativeImages = imageDatastore(negativeFolder);

Train a cascade object detector called 'stopSignDetector.xml' using HOG features. NOTE: The command can take several minutes to run.

trainCascadeObjectDetector('stopSignDetector.xml',positiveInstances, ...
    negativeFolder,'FalseAlarmRate',0.1,'NumCascadeStages',5);
Automatically setting ObjectTrainingSize to [35, 32]
Using at most 42 of 42 positive samples per stage
Using at most 84 negative samples per stage

--cascadeParams--
Training stage 1 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 1: 0 seconds

Training stage 2 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 2: 0 seconds

Training stage 3 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 3: 2 seconds

Training stage 4 of 5
[........................................................................]
Used 42 positive and 84 negative samples
Time to train stage 4: 5 seconds

Training stage 5 of 5
[........................................................................]
Used 42 positive and 17 negative samples
Time to train stage 5: 9 seconds

Training complete

Use the newly trained classifier to detect a stop sign in an image.

detector = vision.CascadeObjectDetector('stopSignDetector.xml');

Read the test image.

img = imread('stopSignTest.jpg');

Detect a stop sign.

bbox = step(detector,img);

Insert bounding box rectangles and return the marked image.

 detectedImg = insertObjectAnnotation(img,'rectangle',bbox,'stop sign');

Display the detected stop sign.

figure; imshow(detectedImg);

Remove the image directory from the path.

rmpath(imDir);

Input Arguments

collapse all

Positive samples, specified as a two-column table or two-field structure.

The first table column or structure field contains image file names, specified as character vectors or string scalars. Each image can be true color, grayscale, or indexed, in any of the formats supported by imread.

The second table column or structure field contains an M-by-4 matrix of M bounding boxes. Each bounding box is in the format [x y width height] and specifies an object location in the corresponding image.

You can use the Image Labeler or Video Labeler app to label objects of interest with bounding boxes. The app returns a groundTruth object. Use the objectDetectorTrainingData function to obtain a table from the object to use for positiveInstances. The function automatically determines the number of positive samples to use at each of the cascade stages. This value is based on the number of stages and the true positive rate. The true positive rate specifies how many positive samples can be misclassified.

Data Types: table | struct

Negative images, specified as an ImageDatastore object, a path to a folder containing images, or as a cell array of image file names. Because the images are used to generate negative samples, they must not contain any objects of interest. Instead, they should contain backgrounds associated with the object.

Trained cascade detector file name, specified as a character vector or a string scalar with an XML extension. For example, 'stopSignDetector.xml'.

Data Types: char

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'FeatureType','Haar' specifies Haar for the type of features to use.

Training object size, specified as the comma-separated pair. This pair contains 'ObjectTrainingSize' and either a two-element [height, width] vector, or as 'Auto'. Before training, the function resizes the positive and negative samples to ObjectTrainingSize in pixels. If you select 'Auto', the function determines the size automatically based on the median width-to-height ratio of the positive instances. For optimal detection accuracy, specify an object training size close to the expected size of the object in the image. However, for faster training and detection, set the object training size to be smaller than the expected size of the object in the image.

Data Types: char | single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Negative sample factor, specified as the comma-separated pair consisting of 'NegativeSamplesFactor' and a real-valued scalar. The number of negative samples to use at each stage is equal to

NegativeSamplesFactor × [the number of positive samples used at each stage].

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Number of cascade stages to train, specified as the comma-separated pair consisting of 'NumCascadeStages' and a positive integer. Increasing the number of stages may result in a more accurate detector but also increases training time. More stages can require more training images, because at each stage, some number of positive and negative samples are eliminated. This value depends on the values of FalseAlarmRate and TruePositiveRate. More stages can also enable you to increase the FalseAlarmRate. See the Train a Cascade Object Detector tutorial for more details.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Acceptable false alarm rate at each stage, specified as the comma-separated pair consisting of 'FalseAlarmRate' and a value in the range (0 1]. The false alarm rate is the fraction of negative training samples incorrectly classified as positive samples.

The overall false alarm rate is calculated using the FalseAlarmRate per stage and the number of cascade stages, NumCascadeStages:

FalseAlarmRateNumCascadeStages

Lower values for FalseAlarmRate increase complexity of each stage. Increased complexity can achieve fewer false detections but can result in longer training and detection times. Higher values for FalseAlarmRate can require a greater number of cascade stages to achieve reasonable detection accuracy.

Data Types: single | double

Minimum true positive rate required at each stage, specified as the comma-separated pair consisting of 'TruePositiveRate' and a value in the range (0 1]. The true positive rate is the fraction of correctly classified positive training samples.

The overall resulting target positive rate is calculated using the TruePositiveRate per stage and the number of cascade stages, NumCascadeStages:

TruePositiveRateNumCascadeStages

Higher values for TruePositiveRate increase complexity of each stage. Increased complexity can achieve a greater number of correct detections but can result in longer training and detection times.

Data Types: single | double

Feature type, specified as the comma-separated pair consisting of 'FeatureType' and one of the following:

'Haar'[1] — Haar-like features
'LBP'[2] — Local binary patterns
'HOG'[3] — Histogram of oriented gradients

The function allocates a large amount of memory, especially the Haar features. To avoid running out of memory, use this function on a 64-bit operating system with a sufficient amount of RAM.

Data Types: char

Tips

  • Training a good detector requires thousands of training samples. Processing time for a large amount of data varies, but it is likely to take hours or even days. During training, the function displays the time it took to train each stage in the MATLAB® command window.

  • The OpenCV HOG parameters used in this function are:

    • Numbins: 9

    • CellSize = [8 8]

    • BlockSize = [4 4]

    • BlockOverlap = [2 2]

    • UseSignedOrientation = false

References

[1] Viola, P., and M. J. Jones. "Rapid Object Detection using a Boosted Cascade of Simple Features." Proceedings of the 2001 IEEE Computer Society Conference. Volume 1, 15 April 2001, pp. I-511–I-518.

[2] Ojala, T., M. Pietikainen, and T. Maenpaa. “Multiresolution Gray-scale and Rotation Invariant Texture Classification With Local Binary Patterns.” IEEE Transactions on Pattern Analysis and Machine Intelligence. Volume 24, No. 7 July 2002, pp. 971–987.

[3] Dalal, N., and B. Triggs. “Histograms of Oriented Gradients for Human Detection.” IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Volume 1, 2005, pp. 886–893.

Introduced in R2013a