Create training data for an object detector
[imds,blds] = objectDetectorTrainingData(gTruth) creates an image datastore and a box label datastore of training data from the specified ground truth.
You can combine the image and box label datastores using combine(imds,blds) to create a datastore needed for training. Use the combined datastore with the training functions, such as trainACFObjectDetector, trainYOLOv2ObjectDetector, trainFastRCNNObjectDetector, trainFasterRCNNObjectDetector, and trainRCNNObjectDetector.
This function supports parallel computing using multiple MATLAB® workers. Enable parallel computing using the Computer Vision Toolbox Preferences dialog.
trainingDataTable = objectDetectorTrainingData(gTruth) returns a table of training data from the specified ground truth. gTruth is an array of groundTruth objects. You can use the table to train an object detector using the Computer Vision Toolbox™ training functions.
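For example, a minimal sketch of the table workflow, assuming a groundTruth object named gTruth already exists in the workspace:
% Convert ground truth to a training table, then train a detector on it.
trainingData = objectDetectorTrainingData(gTruth);
acfDetector = trainACFObjectDetector(trainingData);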
___ = objectDetectorTrainingData(gTruth,Name,Value) returns a table of training data with additional options specified by one or more name-value pair arguments. If you create the groundTruth objects in gTruth using a video file, a custom data source, or an imageDatastore object with different custom read functions, then you can specify any combination of name-value pair arguments. If you create the groundTruth objects from an image collection or image sequence data source, then you can specify only the 'SamplingFactor' name-value pair argument.
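For instance, a minimal sketch of the name-value syntax, assuming gTruth was created from a video file and that the output folder (a hypothetical name) already exists with write permission:
% Keep every 2nd frame and write the extracted images to outDir.
outDir = fullfile(tempdir,'frames');
trainingData = objectDetectorTrainingData(gTruth, ...
    'SamplingFactor',2, ...
    'WriteLocation',outDir);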
Train a vehicle detector based on a YOLO v2 network.
Add the folder containing images to the MATLAB path.
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','vehicles');
addpath(imageDir);
Load the vehicle ground truth data.
data = load('vehicleTrainingGroundTruth.mat');
gTruth = data.vehicleTrainingGroundTruth;
Load the detector containing the layerGraph object for training.
vehicleDetector = load('yolov2VehicleDetector.mat');
lgraph = vehicleDetector.lgraph
lgraph =
  LayerGraph with properties:

         Layers: [25×1 nnet.cnn.layer.Layer]
    Connections: [24×2 table]
     InputNames: {'input'}
    OutputNames: {'yolov2OutputLayer'}
Create an image datastore and box label datastore using the ground truth object.
[imds,bxds] = objectDetectorTrainingData(gTruth);
Combine the datastores.
cds = combine(imds,bxds);
Configure training options.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.001, ...
    'Verbose',true, ...
    'MiniBatchSize',16, ...
    'MaxEpochs',30, ...
    'Shuffle','every-epoch', ...
    'VerboseFrequency',10);
Train the detector.
[detector,info] = trainYOLOv2ObjectDetector(cds,lgraph,options);
*************************************************************************
Training a YOLO v2 Object Detector for the following object classes:

* vehicle

Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |     RMSE     |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:00 |         7.50 |         56.2 |          0.0010 |
|       1 |          10 |       00:00:02 |         1.73 |          3.0 |          0.0010 |
|       2 |          20 |       00:00:04 |         1.58 |          2.5 |          0.0010 |
|       2 |          30 |       00:00:06 |         1.36 |          1.9 |          0.0010 |
|       3 |          40 |       00:00:08 |         1.13 |          1.3 |          0.0010 |
|       3 |          50 |       00:00:09 |         1.01 |          1.0 |          0.0010 |
|       4 |          60 |       00:00:11 |         0.95 |          0.9 |          0.0010 |
|       4 |          70 |       00:00:13 |         0.84 |          0.7 |          0.0010 |
|       5 |          80 |       00:00:15 |         0.84 |          0.7 |          0.0010 |
|       5 |          90 |       00:00:17 |         0.70 |          0.5 |          0.0010 |
|       6 |         100 |       00:00:19 |         0.65 |          0.4 |          0.0010 |
|       7 |         110 |       00:00:21 |         0.73 |          0.5 |          0.0010 |
|       7 |         120 |       00:00:23 |         0.60 |          0.4 |          0.0010 |
|       8 |         130 |       00:00:24 |         0.63 |          0.4 |          0.0010 |
|       8 |         140 |       00:00:26 |         0.64 |          0.4 |          0.0010 |
|       9 |         150 |       00:00:28 |         0.57 |          0.3 |          0.0010 |
|       9 |         160 |       00:00:30 |         0.54 |          0.3 |          0.0010 |
|      10 |         170 |       00:00:32 |         0.52 |          0.3 |          0.0010 |
|      10 |         180 |       00:00:33 |         0.45 |          0.2 |          0.0010 |
|      11 |         190 |       00:00:35 |         0.55 |          0.3 |          0.0010 |
|      12 |         200 |       00:00:37 |         0.56 |          0.3 |          0.0010 |
|      12 |         210 |       00:00:39 |         0.55 |          0.3 |          0.0010 |
|      13 |         220 |       00:00:41 |         0.52 |          0.3 |          0.0010 |
|      13 |         230 |       00:00:42 |         0.53 |          0.3 |          0.0010 |
|      14 |         240 |       00:00:44 |         0.58 |          0.3 |          0.0010 |
|      14 |         250 |       00:00:46 |         0.47 |          0.2 |          0.0010 |
|      15 |         260 |       00:00:48 |         0.49 |          0.2 |          0.0010 |
|      15 |         270 |       00:00:50 |         0.44 |          0.2 |          0.0010 |
|      16 |         280 |       00:00:52 |         0.45 |          0.2 |          0.0010 |
|      17 |         290 |       00:00:54 |         0.47 |          0.2 |          0.0010 |
|      17 |         300 |       00:00:55 |         0.43 |          0.2 |          0.0010 |
|      18 |         310 |       00:00:57 |         0.44 |          0.2 |          0.0010 |
|      18 |         320 |       00:00:59 |         0.44 |          0.2 |          0.0010 |
|      19 |         330 |       00:01:01 |         0.38 |          0.1 |          0.0010 |
|      19 |         340 |       00:01:03 |         0.41 |          0.2 |          0.0010 |
|      20 |         350 |       00:01:04 |         0.39 |          0.2 |          0.0010 |
|      20 |         360 |       00:01:06 |         0.42 |          0.2 |          0.0010 |
|      21 |         370 |       00:01:08 |         0.42 |          0.2 |          0.0010 |
|      22 |         380 |       00:01:10 |         0.39 |          0.2 |          0.0010 |
|      22 |         390 |       00:01:12 |         0.37 |          0.1 |          0.0010 |
|      23 |         400 |       00:01:13 |         0.37 |          0.1 |          0.0010 |
|      23 |         410 |       00:01:15 |         0.35 |          0.1 |          0.0010 |
|      24 |         420 |       00:01:17 |         0.29 |      8.3e-02 |          0.0010 |
|      24 |         430 |       00:01:19 |         0.36 |          0.1 |          0.0010 |
|      25 |         440 |       00:01:21 |         0.28 |      7.9e-02 |          0.0010 |
|      25 |         450 |       00:01:22 |         0.29 |      8.1e-02 |          0.0010 |
|      26 |         460 |       00:01:24 |         0.28 |      8.0e-02 |          0.0010 |
|      27 |         470 |       00:01:26 |         0.27 |      7.1e-02 |          0.0010 |
|      27 |         480 |       00:01:28 |         0.25 |      6.3e-02 |          0.0010 |
|      28 |         490 |       00:01:30 |         0.24 |      5.9e-02 |          0.0010 |
|      28 |         500 |       00:01:31 |         0.29 |      8.4e-02 |          0.0010 |
|      29 |         510 |       00:01:33 |         0.35 |          0.1 |          0.0010 |
|      29 |         520 |       00:01:35 |         0.31 |      9.3e-02 |          0.0010 |
|      30 |         530 |       00:01:37 |         0.18 |      3.1e-02 |          0.0010 |
|      30 |         540 |       00:01:38 |         0.22 |      4.6e-02 |          0.0010 |
|========================================================================================|
Detector training complete.
*************************************************************************
Read a test image.
I = imread('detectcars.png');
Run the detector.
[bboxes,scores] = detect(detector,I);
Display the results.
if(~isempty(bboxes))
    I = insertObjectAnnotation(I,'rectangle',bboxes,scores);
end
figure
imshow(I)
Use training data to train an ACF-based object detector for stop signs.
Add the folder containing images to the MATLAB path.
imageDir = fullfile(matlabroot,'toolbox','vision','visiondata','stopSignImages');
addpath(imageDir);
Load ground truth data, which contains data for stop signs and cars.
load('stopSignsAndCarsGroundTruth.mat','stopSignsAndCarsGroundTruth')
View the label definitions to see the label types in the ground truth.
stopSignsAndCarsGroundTruth.LabelDefinitions
Select the stop sign data for training.
stopSignGroundTruth = selectLabels(stopSignsAndCarsGroundTruth,'stopSign');
Create the training data for a stop sign object detector.
trainingData = objectDetectorTrainingData(stopSignGroundTruth);
summary(trainingData)
Variables:
    imageFilename: 41×1 cell array of character vectors
    stopSign: 41×1 cell
Train an ACF-based object detector.
acfDetector = trainACFObjectDetector(trainingData,'NegativeSamplesFactor',2);
ACF Object Detector Training
The training will take 4 stages. The model size is 34x31.
Sample positive examples(~100% Completed)
Compute approximation coefficients...Completed.
Compute aggregated channel features...Completed.
--------------------------------------------
Stage 1:
Sample negative examples(~100% Completed)
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 19 weak learners.
--------------------------------------------
Stage 2:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 20 weak learners.
--------------------------------------------
Stage 3:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 54 weak learners.
--------------------------------------------
Stage 4:
Sample negative examples(~100% Completed)
Found 84 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 42 positive examples and 84 negative examples...Completed.
The trained classifier has 61 weak learners.
--------------------------------------------
ACF object detector training is completed.
Elapsed time is 30.3579 seconds.
Test the ACF-based detector on a sample image.
I = imread('stopSignTest.jpg');
bboxes = detect(acfDetector,I);
Display the detected object.
annotation = acfDetector.ModelName;
I = insertObjectAnnotation(I,'rectangle',bboxes,annotation);
figure
imshow(I)
Remove the image folder from the path.
rmpath(imageDir);
Use training data to train an ACF-based object detector for vehicles.
Add the folder containing images to the MATLAB path.
imageDir = fullfile(matlabroot,'toolbox','driving','drivingdata','vehiclesSequence');
addpath(imageDir);
Load the ground truth data.
load vehicleGroundTruth.mat
Create the training data for an object detector for vehicles.
trainingData = objectDetectorTrainingData(gTruth,'SamplingFactor',2);
Train the ACF-based object detector.
acfDetector = trainACFObjectDetector(trainingData,'ObjectTrainingSize',[20 20]);
ACF Object Detector Training
The training will take 4 stages. The model size is 20x20.
Sample positive examples(~100% Completed)
Compute approximation coefficients...Completed.
Compute aggregated channel features...Completed.
--------------------------------------------
Stage 1:
Sample negative examples(~100% Completed)
Compute aggregated channel features...Completed.
Train classifier with 71 positive examples and 355 negative examples...Completed.
The trained classifier has 68 weak learners.
--------------------------------------------
Stage 2:
Sample negative examples(~100% Completed)
Found 76 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 71 positive examples and 355 negative examples...Completed.
The trained classifier has 120 weak learners.
--------------------------------------------
Stage 3:
Sample negative examples(~100% Completed)
Found 54 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 71 positive examples and 355 negative examples...Completed.
The trained classifier has 170 weak learners.
--------------------------------------------
Stage 4:
Sample negative examples(~100% Completed)
Found 63 new negative examples for training.
Compute aggregated channel features...Completed.
Train classifier with 71 positive examples and 355 negative examples...Completed.
The trained classifier has 215 weak learners.
--------------------------------------------
ACF object detector training is completed.
Elapsed time is 28.4547 seconds.
Test the ACF detector on a test image.
I = imread('highway.png');
[bboxes, scores] = detect(acfDetector,I,'Threshold',1);
Select the detection with the highest classification score.
[~,idx] = max(scores);
Display the detected object.
annotation = acfDetector.ModelName;
I = insertObjectAnnotation(I,'rectangle',bboxes(idx,:),annotation);
figure
imshow(I)
Remove the image folder from the path.
rmpath(imageDir);
gTruth — Ground truth data
groundTruth objects

Ground truth data, specified as a scalar or an array of groundTruth objects. You can create ground truth objects from existing ground truth data by using the groundTruth object.
If you use custom data sources in groundTruth
with parallel computing enabled, then the reader
function is expected to work with a pool of MATLAB workers to read images from the data source in
parallel.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'SamplingFactor',5
'SamplingFactor' — Factor for subsampling images
'auto' (default) | integer | vector of integers

Factor for subsampling images in the ground truth data source, specified as 'auto', an integer, or a vector of integers. For a sampling factor of N, the returned training data includes every Nth image in the ground truth data source. The function ignores ground truth images with empty label data.
| Value | Sampling Factor |
| --- | --- |
| 'auto' | The sampling factor N is 5 for data sources with timestamps, and 1 for a collection of images. |
| Integer | All ground truth data sources in gTruth are sampled with the same sampling factor N. |
| Vector of integers | The kth ground truth data source in gTruth is sampled with a sampling factor of N(k). |
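For example, a sketch of per-source sampling, assuming gTruth is a hypothetical 1-by-2 array of groundTruth objects:
% Sample every 5th image from the first source and every image from the second.
trainingData = objectDetectorTrainingData(gTruth,'SamplingFactor',[5 1]);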
'WriteLocation' — Name of folder
pwd (current working folder) (default) | string scalar | character vector

Folder name to write extracted images to, specified as a string scalar or character vector. The specified folder must exist and have write permissions.

This argument applies only for:

- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using an imageDatastore with different custom read functions.

The function ignores this argument when:

- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects containing datastores use the default read function.
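A minimal sketch of using 'WriteLocation', assuming gTruth was created from a video file; the folder name is hypothetical and must exist before the call:
outDir = fullfile(tempdir,'extractedTrainingImages');  % hypothetical folder
if ~exist(outDir,'dir')
    mkdir(outDir)   % the function does not create the folder for you
end
[imds,blds] = objectDetectorTrainingData(gTruth,'WriteLocation',outDir);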
'ImageFormat' — Image file format
'PNG' (default) | string scalar | character vector

Image file format, specified as a string scalar or character vector. File formats must be supported by imwrite.

This argument applies only for:

- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using an imageDatastore with different custom read functions.

The function ignores this argument when:

- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects containing datastores use the default read function.
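For instance, a sketch that writes extracted frames as JPEG instead of the default PNG, under the same video-file assumption as above; the folder name is hypothetical:
% 'JPEG' must be a format that imwrite supports.
outDir = fullfile(tempdir,'extractedTrainingImages');  % hypothetical existing folder
[imds,blds] = objectDetectorTrainingData(gTruth, ...
    'WriteLocation',outDir,'ImageFormat','JPEG');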
'NamePrefix' — Prefix for output image file names
string scalar | character vector

Prefix for output image file names, specified as a string scalar or character vector. The image files are named as:

<name_prefix><source_number>_<image_number>.<image_format>

The default value uses the name of the data source that the images were extracted from, strcat(sourceName,'_'), for video and a custom data source, or 'datastore', for an image datastore.

This argument applies only for:

- groundTruth objects created using a video file or a custom data source.
- An array of groundTruth objects created using an imageDatastore with different custom read functions.

The function ignores this argument when:

- The input groundTruth object was created from an image sequence data source.
- The array of input groundTruth objects all contain image datastores using the same custom read function.
- Any of the input groundTruth objects containing datastores use the default read function.
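For example, a sketch with a hypothetical prefix; following the naming pattern above, the extracted files would be named vehicleFrame_1_1.png, vehicleFrame_1_2.png, and so on:
outDir = fullfile(tempdir,'extractedTrainingImages');  % hypothetical existing folder
[imds,blds] = objectDetectorTrainingData(gTruth, ...
    'WriteLocation',outDir,'NamePrefix','vehicleFrame_');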
'Verbose' — Flag to display training progress
true (default) | false

Flag to display training progress at the MATLAB command line, specified as either true or false.

This property applies only for groundTruth objects created using a video file or a custom data source.
imds — Image datastore
imageDatastore object

Image datastore, returned as an imageDatastore object containing images extracted from the gTruth objects. The images in imds contain at least one class of annotated labels. The function ignores images that are not annotated.
blds — Box label datastore
boxLabelDatastore object

Box label datastore, returned as a boxLabelDatastore object. The datastore contains categorical vectors for ROI label names and M-by-4 matrices of M bounding boxes. The locations and sizes of the bounding boxes are represented as double M-by-4 element vectors in the format [x,y,width,height].
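A minimal sketch of consuming the two datastores together; as a working assumption, each read of the combined datastore returns a 1-by-3 cell array holding the image, the boxes, and the labels:
cds = combine(imds,blds);
data = read(cds);
I      = data{1};   % image
bboxes = data{2};   % M-by-4 [x,y,width,height] boxes
labels = data{3};   % M-by-1 categorical label names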
trainingDataTable — Training data table
table

Training data table, returned as a table with two or more columns. The first column of the table contains image file names with paths. The images can be grayscale or truecolor (RGB) and in any format supported by imread. Each of the remaining columns corresponds to an ROI label and contains the locations of bounding boxes in the image (specified in the first column) for that label. The bounding boxes are specified as M-by-4 matrices of M bounding boxes in the format [x,y,width,height]. [x,y] specifies the upper-left corner location. To create a ground truth table, you can use the Image Labeler app or the Video Labeler app.

The output table ignores any sublabel or attribute data present in the input gTruth object.
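For illustration, a sketch of inspecting one row of a returned training data table; the label column name vehicle is hypothetical and depends on the ROI labels in your ground truth:
trainingData = objectDetectorTrainingData(gTruth);
I = imread(trainingData.imageFilename{1});   % first training image
bboxes = trainingData.vehicle{1};            % M-by-4 [x,y,width,height] boxes
I = insertShape(I,'rectangle',bboxes);       % draw the ground truth boxes
figure
imshow(I)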
estimateAnchorBoxes | trainACFObjectDetector | trainFasterRCNNObjectDetector | trainFastRCNNObjectDetector | trainRCNNObjectDetector | trainYOLOv2ObjectDetector