Train a Faster R-CNN deep learning object detector
trainedDetector = trainFasterRCNNObjectDetector(trainingData,network,options) trains a Faster R-CNN (regions with convolutional neural networks) object detector using deep learning. You can train a Faster R-CNN detector to detect multiple object classes.
This function requires that you have Deep Learning Toolbox™. It is recommended that you also have Parallel Computing Toolbox™ to use with a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher.
[trainedDetector,info] = trainFasterRCNNObjectDetector(___) also returns information on the training progress, such as training loss and accuracy, for each iteration.
trainedDetector = trainFasterRCNNObjectDetector(trainingData,checkpoint,options) resumes training from a detector checkpoint.
trainedDetector = trainFasterRCNNObjectDetector(trainingData,detector,options) continues training a Faster R-CNN object detector with additional fine-tuning options. Use this syntax with additional training data or to perform more training iterations to improve detector accuracy.
trainedDetector = trainFasterRCNNObjectDetector(___,Name,Value) uses additional options specified by one or more Name,Value pair arguments and any of the previous inputs.
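The syntaxes above might be combined as in the following minimal sketch. It assumes that trainingData already exists in the workspace in a form the function accepts, that Deep Learning Toolbox and the resnet50 pretrained model are installed, and that checkpoints are saved to tempdir. The checkpoint file name is hypothetical, and the variable name stored in the checkpoint file (detector) is an assumption. The option values are illustrative, not recommendations.

```matlab
% Assumption: trainingData is already defined in the workspace.

% Training options from Deep Learning Toolbox. Values are illustrative.
options = trainingOptions('sgdm', ...
    'MiniBatchSize',1, ...
    'InitialLearnRate',1e-3, ...
    'MaxEpochs',5, ...
    'CheckpointPath',tempdir);   % folder where checkpoints are written

% Train from a network specified by name and also return training info.
[detector,info] = trainFasterRCNNObjectDetector(trainingData,'resnet50',options);

% Continue training the returned detector, for example for more iterations.
detector = trainFasterRCNNObjectDetector(trainingData,detector,options);

% Resume from a saved checkpoint. The file name is hypothetical, and the
% stored variable is assumed to be named "detector".
data = load(fullfile(tempdir,'my_checkpoint.mat'));
checkpoint = data.detector;
detector = trainFasterRCNNObjectDetector(trainingData,checkpoint,options);
```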
To accelerate data preprocessing for training, trainFasterRCNNObjectDetector automatically creates and uses a parallel pool based on your parallel preference settings. For more details about setting these preferences, see parallel preference settings. Using parallel computing preferences requires Parallel Computing Toolbox.
VGG-16, VGG-19, ResNet-101, and Inception-ResNet-v2 are large models. Training with large images can produce "out-of-memory" errors. To mitigate these errors, try one or more of these options:
Reduce the size of your images by using the 'SmallestImageDimension' argument.
Decrease the value of the 'NumRegionsToSample' name-value argument.
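As a rough illustration of these two mitigations, the call below passes both arguments at once. It is a sketch only: trainingData and options are assumed to exist already, and the values 400 and 64 are arbitrary placeholders to tune for your images and GPU memory.

```matlab
% Assumptions: trainingData and options (from trainingOptions) already exist.
% The numeric values are placeholders, not recommended settings.
detector = trainFasterRCNNObjectDetector(trainingData,'vgg16',options, ...
    'SmallestImageDimension',400, ...   % shrink images before training
    'NumRegionsToSample',64);           % sample fewer regions per image
```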
This function supports transfer learning. When you input a network by name, such as 'resnet50', then the function automatically transforms the network into a valid Faster R-CNN network model based on the pretrained resnet50 (Deep Learning Toolbox) model. Alternatively, manually specify a custom Faster R-CNN network by using the LayerGraph (Deep Learning Toolbox) extracted from a pretrained DAG network. For more details, see Create Faster R-CNN Object Detection Network.
This table describes how to transform each named network into a Faster R-CNN network. The feature extraction layer name specifies the layer for processing by the ROI pooling layer. The ROI output size specifies the size of the feature maps output by the ROI pooling layer.
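Network Name | Feature Extraction Layer Name | ROI Pooling Layer OutputSize | Description |
---|---|---|---|
alexnet (Deep Learning Toolbox) | 'relu5' | [6 6] | Last max pooling layer is replaced by ROI max pooling layer |
vgg16 (Deep Learning Toolbox) | 'relu5_3' | [7 7] | Last max pooling layer is replaced by ROI max pooling layer |
vgg19 (Deep Learning Toolbox) | 'relu5_4' | [7 7] | Last max pooling layer is replaced by ROI max pooling layer |
squeezenet (Deep Learning Toolbox) | 'fire5-concat' | [14 14] | Last max pooling layer is replaced by ROI max pooling layer |
resnet18 (Deep Learning Toolbox) | 'res4b_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
resnet50 (Deep Learning Toolbox) | 'activation_40_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
resnet101 (Deep Learning Toolbox) | 'res4b22_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
googlenet (Deep Learning Toolbox) | 'inception_4d-output' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
mobilenetv2 (Deep Learning Toolbox) | 'block_13_expand_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
inceptionv3 (Deep Learning Toolbox) | 'mixed7' | [17 17] | ROI pooling layer is inserted after the feature extraction layer. |
inceptionresnetv2 (Deep Learning Toolbox) | 'block17_20_ac' | [17 17] | ROI pooling layer is inserted after the feature extraction layer. |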
For information on modifying how a network is transformed into a Faster R-CNN network, see Design an R-CNN, Fast R-CNN, and a Faster R-CNN Model.
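One possible way to build such a network explicitly is the fasterRCNNLayers function. The sketch below assumes that function is available in your release with the signature shown, and that the resnet50 pretrained model is installed. The input size, class count, and anchor boxes are made-up illustration values. The feature layer name is the one listed for resnet50 in the table above.

```matlab
% Illustrative values only; choose them to match your data.
inputImageSize = [224 224 3];            % [height width channels]
numClasses = 1;                          % number of object classes
anchorBoxes = [64 64; 128 128; 192 192]; % each row is [height width]

% Build a Faster R-CNN layer graph from the pretrained resnet50 model,
% using the feature extraction layer listed for resnet50.
lgraph = fasterRCNNLayers(inputImageSize,numClasses,anchorBoxes, ...
    'resnet50','activation_40_relu');
```

The resulting lgraph can then be passed as the network argument of trainFasterRCNNObjectDetector in place of a network name.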
During training, the function processes multiple image regions from each training image. The NumRegionsToSample name-value argument controls the number of image regions per image. The PositiveOverlapRange and NegativeOverlapRange name-value arguments control which image regions are used for training. Positive training samples are those that overlap with the ground truth boxes by 0.6 to 1.0, as measured by the bounding box intersection-over-union (IoU) metric. Negative training samples are those that overlap by 0 to 0.3. Choose values for these arguments by testing the trained detector on a validation set.
Overlap Values | Description |
---|---|
PositiveOverlapRange set to [0.6 1] | Positive training samples are set equal to the samples that overlap with the ground truth boxes by 0.6 to 1.0, measured by the bounding box IoU metric. |
NegativeOverlapRange set to [0 0.3] | Negative training samples are set equal to the samples that overlap with the ground truth boxes by 0 to 0.3. |
If you set PositiveOverlapRange to [0.6 1], then the function sets the positive training samples equal to the samples that overlap with the ground truth boxes by 0.6 to 1.0, measured by the bounding box IoU metric. If you set NegativeOverlapRange to [0 0.3], then the function sets the negative training samples equal to the samples that overlap with the ground truth boxes by 0 to 0.3.
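To make the overlap ranges concrete, the following sketch computes IoU by hand for a few made-up boxes and labels each one against the ranges [0.6 1] and [0 0.3]. It illustrates the metric only and is not the function's internal sampling code. The boxes use the [x y width height] convention with continuous coordinates, and treating both range endpoints as inclusive is an assumption.

```matlab
% Hypothetical ground truth box and candidate regions, [x y width height].
gtBox = [50 50 100 100];
regions = [ 50  50 100 100;   % identical to the ground truth
            60  60 100 100;   % large overlap
            80  80 100 100;   % moderate overlap
           140 140 100 100;   % small overlap
           300 300  50  50];  % no overlap

positiveOverlapRange = [0.6 1];
negativeOverlapRange = [0 0.3];

iou = zeros(size(regions,1),1);
for k = 1:size(regions,1)
    r = regions(k,:);
    % Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0, min(gtBox(1)+gtBox(3), r(1)+r(3)) - max(gtBox(1), r(1)));
    ih = max(0, min(gtBox(2)+gtBox(4), r(2)+r(4)) - max(gtBox(2), r(2)));
    interArea = iw*ih;
    unionArea = gtBox(3)*gtBox(4) + r(3)*r(4) - interArea;
    iou(k) = interArea/unionArea;
end

% Endpoints are treated as inclusive here (an assumption).
isPositive = iou >= positiveOverlapRange(1) & iou <= positiveOverlapRange(2);
isNegative = iou >= negativeOverlapRange(1) & iou <= negativeOverlapRange(2);
```

In this example the third region has an IoU of roughly 0.32, so it falls in neither range. Ranges like these are passed to the training function as 'PositiveOverlapRange',[0.6 1] and 'NegativeOverlapRange',[0 0.3].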
Use the trainingOptions (Deep Learning Toolbox) function to enable or disable verbose printing.
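For instance, a minimal sketch that turns verbose printing off, assuming Deep Learning Toolbox is installed and the 'sgdm' solver is used (the solver choice is illustrative):

```matlab
% Disable progress printing in the Command Window during training.
options = trainingOptions('sgdm', ...
    'Verbose',false);
```

Setting 'Verbose' to true instead re-enables printing, and the 'VerboseFrequency' option of trainingOptions controls how often progress is reported.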
estimateAnchorBoxes | fasterRCNNLayers | objectDetectorTrainingData | trainFastRCNNObjectDetector | trainRCNNObjectDetector | trainingOptions (Deep Learning Toolbox)
boxLabelDatastore | fasterRCNNObjectDetector | averagePooling2dLayer (Deep Learning Toolbox) | Layer (Deep Learning Toolbox) | layerGraph (Deep Learning Toolbox) | maxPooling2dLayer (Deep Learning Toolbox) | SeriesNetwork (Deep Learning Toolbox)