detect

Detect objects using ACF object detector

Description

bboxes = detect(detector,I) detects objects within image I using the input aggregate channel features (ACF) object detector. The locations of objects detected are returned as a set of bounding boxes.

[bboxes,scores] = detect(detector,I) also returns the detection scores for each bounding box.

[___] = detect(detector,I,roi) detects objects within the rectangular search region specified by roi, using either of the preceding syntaxes.

[___] = detect(___,Name,Value) specifies options using one or more Name,Value pair arguments. For example, detect(detector,I,'WindowStride',2) sets the stride of the sliding window used to detect objects to 2.

Examples

Use the trainACFObjectDetector function with training images to create an ACF object detector that can detect stop signs. Test the detector with a separate image.

Load the training data.

load('stopSignsAndCars.mat')

Select the ground truth for stop signs. This ground truth is the set of known locations of stop signs in the images.

stopSigns = stopSignsAndCars(:,1:2);

Add the full path to the image files.

stopSigns.imageFilename = fullfile(toolboxdir('vision'),...
    'visiondata',stopSigns.imageFilename);

Train the ACF detector. You can turn off the training progress output by specifying 'Verbose',false as a Name,Value pair.

acfDetector = trainACFObjectDetector(stopSigns,'NegativeSamplesFactor',2);
ACF Object Detector Training
The training will take 4 stages. The model size is 34x31.
Sample positive examples(~100% Completed)
Compute approximation coefficients...Completed.
Compute aggregated channel features...Completed.
--------------------------------------------
Stage 1:
Sample negative examples(~100% Completed)
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 19 weak learners.
--------------------------------------------
Stage 2:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 20 weak learners.
--------------------------------------------
Stage 3:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 54 weak learners.
--------------------------------------------
Stage 4:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 61 weak learners.
--------------------------------------------
ACF object detector training is completed. Elapsed time is 19.2848 seconds.

Test the ACF detector on a test image.

img = imread('stopSignTest.jpg');

[bboxes,scores] = detect(acfDetector,img);

Insert the bounding boxes and confidence annotations for the detected objects into the image, and then display the result.

for i = 1:length(scores)
   annotation = sprintf('Confidence = %.1f',scores(i));
   img = insertObjectAnnotation(img,'rectangle',bboxes(i,:),annotation);
end

figure
imshow(img)

Input Arguments

ACF object detector (detector), specified as an acfObjectDetector object. To create this object, call the trainACFObjectDetector function with training data as input.

Input image (I), specified as a real, nonsparse, grayscale or RGB image.

Data Types: uint8 | uint16 | int16 | double | single | logical

Search region of interest (roi), specified as an [x y width height] vector. The vector specifies the upper left corner and size of a region in pixels.
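
As an illustration of the [x y width height] layout, the following sketch builds a region of interest that covers the lower half of an image and checks that it lies inside the image. The image size is a hypothetical value chosen for the example, and the commented detect call assumes that a trained detector (acfDetector) and an image of that size (img) are already in the workspace.

```matlab
% Hypothetical image size: 480 rows by 640 columns.
imgHeight = 480;
imgWidth  = 640;

% ROI covering the lower half of the image, as [x y width height].
% x and y are the 1-based pixel coordinates of the upper left corner.
roi = [1, imgHeight/2 + 1, imgWidth, imgHeight/2];

% Check that the region lies fully inside the image.
roiInsideImage = roi(1) >= 1 && roi(2) >= 1 && ...
    roi(1) + roi(3) - 1 <= imgWidth && ...
    roi(2) + roi(4) - 1 <= imgHeight;

% With a trained detector and a matching image in the workspace:
% [bboxes,scores] = detect(acfDetector,img,roi);
```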

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'NumScaleLevels',4

Number of scale levels per octave, specified as the comma-separated pair consisting of 'NumScaleLevels' and a positive integer. Each octave is a power-of-two downscaling of the image. To detect objects at finer scale increments, increase this number. Recommended values are in the range [4, 8].
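
To make the idea of scale levels per octave concrete, the sketch below lists the scale factors for two octaves, assuming the levels are spaced evenly in log2 (a common pyramid layout, assumed here rather than taken from the toolbox). The number of octaves is a hypothetical value, and the model height of 34 pixels is borrowed from the training log in the example.

```matlab
numScaleLevels = 4;    % scale levels per octave
numOctaves     = 2;    % hypothetical, for illustration only
modelHeight    = 34;   % model height from the training log in the example

% Assumed layout: levels evenly spaced in log2, so every
% numScaleLevels steps halve the image size.
k = 0:(numScaleLevels*numOctaves);
scales = 2.^(-k/numScaleLevels);

% Object height in the original image that matches the model
% window at each pyramid level.
matchedHeights = modelHeight ./ scales;
```

With four levels per octave, scales(5) is 0.5 and scales(9) is 0.25, so the matched object height doubles every four levels. A larger 'NumScaleLevels' value places more intermediate sizes between those doublings.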

Stride for the sliding window, specified as the comma-separated pair consisting of 'WindowStride' and a positive integer. This value indicates the distance for the function to move the window in both the x and y directions. The sliding window scans the images for object detection.
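
The following sketch gives a rough sense of how the stride affects the amount of work. It counts window positions at a single scale for a few stride values, ignoring the image pyramid and any internal feature downsampling, so the counts are an approximation and not what detect actually evaluates. The image size is hypothetical, and the window size is the model size from the training log in the example, treated here as [height width].

```matlab
imgSize = [480 640];   % hypothetical image size, [height width]
winSize = [34 31];     % model size from the training log, [height width]
strides = [2 4 8];

% Positions per dimension: floor((image - window)/stride) + 1.
% The product over both dimensions gives the window count.
numWindows = arrayfun(@(s) prod(floor((imgSize - winSize)/s) + 1), strides);
```

Doubling the stride reduces the number of windows by roughly a factor of four, at the cost of coarser localization.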

Select the strongest bounding box for each detected object, specified as the comma-separated pair consisting of 'SelectStrongest' and either true or false.

  • true — Return the strongest bounding box per object. To select these boxes, detect calls the selectStrongestBbox function, which uses nonmaximal suppression to eliminate overlapping bounding boxes based on their confidence scores.

  • false — Return all detected bounding boxes. You can then create your own custom operation to eliminate overlapping bounding boxes.
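
As one possible custom operation for the false case, the sketch below applies a simple greedy suppression based on intersection over union. The boxes, scores, and overlap cutoff are hypothetical, and this is not necessarily the rule that selectStrongestBbox applies.

```matlab
% Hypothetical raw detections, as might be returned by
% detect(acfDetector,img,'SelectStrongest',false).
bboxes = [10 10 50 50; 12 12 50 50; 200 80 40 40];   % [x y width height]
scores = [30; 45; 20];
overlapThreshold = 0.5;   % assumed intersection-over-union cutoff

[~,order] = sort(scores,'descend');
keep = false(size(scores));
while ~isempty(order)
    best = order(1);          % strongest remaining box
    keep(best) = true;
    rest = order(2:end);
    % Intersection of the best box with each remaining box.
    x1 = max(bboxes(best,1),bboxes(rest,1));
    y1 = max(bboxes(best,2),bboxes(rest,2));
    x2 = min(bboxes(best,1)+bboxes(best,3),bboxes(rest,1)+bboxes(rest,3));
    y2 = min(bboxes(best,2)+bboxes(best,4),bboxes(rest,2)+bboxes(rest,4));
    interArea = max(0,x2-x1) .* max(0,y2-y1);
    unionArea = bboxes(best,3)*bboxes(best,4) + ...
        bboxes(rest,3).*bboxes(rest,4) - interArea;
    % Drop boxes that overlap the best box too much.
    order = rest(interArea./unionArea <= overlapThreshold);
end
selectedBboxes = bboxes(keep,:);
selectedScores = scores(keep);
```

Here the first two boxes overlap heavily, so only the higher-scoring one is kept along with the isolated third box.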

Minimum region size that contains a detected object, specified as the comma-separated pair consisting of 'MinSize' and a [height width] vector. Units are in pixels.

By default, MinSize is the smallest object that the trained detector can detect.

Maximum region size that contains a detected object, specified as the comma-separated pair consisting of 'MaxSize' and a [height width] vector. Units are in pixels.

To reduce computation time, set this value to the known maximum region size for the objects being detected in the image. By default, 'MaxSize' is set to the height and width of the input image, I.
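
Note that 'MinSize' and 'MaxSize' use [height width] order, whereas bounding boxes use [x y width height]. The sketch below shows the reordering needed to compare the two, using hypothetical size limits and boxes. It only illustrates the element order; detect applies the size limits itself when you pass these options.

```matlab
minSize = [40 40];     % hypothetical limit, [height width]
maxSize = [120 120];   % hypothetical limit, [height width]

% Hypothetical boxes in [x y width height] order.
bboxes = [15 20 30 32; 100 50 60 64; 5 5 200 210];

% Reorder the box sizes to [height width] before comparing.
boxHW = bboxes(:,[4 3]);
inRange = all(boxHW >= minSize & boxHW <= maxSize,2);

% Passing the limits to detect, given a trained detector and image:
% bboxes = detect(acfDetector,img,'MinSize',minSize,'MaxSize',maxSize);
```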

Classification accuracy threshold, specified as the comma-separated pair consisting of 'Threshold' and a numeric scalar. Recommended values are in the range [-1, 1]. During multiscale object detection, the threshold value controls the accuracy and speed for classifying image subregions as either objects or nonobjects. To speed up detection at the risk of missing true detections, increase this threshold.
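
The trade-off between speed and missed detections can be sketched with a generic boosted classifier that rejects a subregion as soon as its running score falls below the threshold. This early-rejection scheme is an assumption about how such thresholds typically work, not a description of the toolbox internals, and the weak learner outputs are hypothetical.

```matlab
weakOutputs = [0.4 -0.9 -0.8 0.6 0.7];   % hypothetical weak learner outputs
thresholds  = [-1 -2];                   % higher versus lower threshold

numEvaluated = zeros(size(thresholds));
accepted = false(size(thresholds));
for t = 1:numel(thresholds)
    runningScore = 0;
    accepted(t) = true;
    for k = 1:numel(weakOutputs)
        runningScore = runningScore + weakOutputs(k);
        numEvaluated(t) = k;
        % Assumed rule: reject as soon as the running score
        % drops below the threshold.
        if runningScore < thresholds(t)
            accepted(t) = false;
            break
        end
    end
end
```

With a threshold of -1, the subregion is rejected after three weak learners. With a threshold of -2, all five are evaluated and the subregion is accepted, which costs more computation but keeps a candidate that the higher threshold discards.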

Output Arguments

Location of objects detected within the input image (bboxes), returned as an M-by-4 matrix, where M is the number of bounding boxes. Each row of bboxes contains a four-element vector of the form [x y width height]. This vector specifies the upper left corner and size of the corresponding bounding box in pixels.

Detection confidence scores (scores), returned as an M-by-1 vector, where M is the number of bounding boxes. A higher score indicates higher confidence in the detection.
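
Because the rows of the two outputs correspond to each other, the scores can be used to index the boxes. The sketch below keeps detections above a cutoff and picks the strongest one. The boxes, scores, and cutoff are hypothetical values standing in for the outputs of detect.

```matlab
% Hypothetical outputs of [bboxes,scores] = detect(acfDetector,img).
bboxes = [12 12 50 50; 200 80 40 40; 300 20 35 35];   % [x y width height]
scores = [45; 20; 8];

% Keep only detections at or above a hypothetical score cutoff.
minScore = 15;
strongBboxes = bboxes(scores >= minScore,:);

% Select the single highest-scoring detection.
[bestScore,bestIdx] = max(scores);
bestBox = bboxes(bestIdx,:);

% Approximate center of the best box: corner plus half the size.
bestCenter = bestBox(1:2) + bestBox(3:4)/2;
```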

Introduced in R2017a