Train a Fast R-CNN deep learning object detector
trainedDetector = trainFastRCNNObjectDetector(trainingData,network,options) trains a Fast R-CNN (regions with convolutional neural networks) object detector using deep learning. You can train a Fast R-CNN detector to detect multiple object classes.
This function requires that you have Deep Learning Toolbox™. It is recommended that you also have Parallel Computing Toolbox™ to use with a CUDA®-enabled NVIDIA® GPU with compute capability 3.0 or higher.
[trainedDetector,info] = trainFastRCNNObjectDetector(___) also returns information on the training progress, such as training loss and accuracy, for each iteration.
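As a rough illustration of the two syntaxes above, the following sketch trains a stop sign detector and plots the per-iteration loss. It assumes the sample file rcnnStopSigns.mat that ships with Computer Vision Toolbox (with its stopSigns table and fastRCNNLayers network) is available. The option values are illustrative, not tuned recommendations, and the TrainingLoss field name should be checked against your release.

```matlab
% Load sample ground truth and a small Fast R-CNN network.
% Assumption: rcnnStopSigns.mat from Computer Vision Toolbox is on the path.
data = load('rcnnStopSigns.mat', 'stopSigns', 'fastRCNNLayers');
stopSigns = data.stopSigns;
fastRCNNLayers = data.fastRCNNLayers;

% Turn the relative image file names into full paths.
stopSigns.imageFilename = fullfile(toolboxdir('vision'), 'visiondata', ...
    stopSigns.imageFilename);

% Illustrative training options (values are assumptions, not recommendations).
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 1, ...
    'InitialLearnRate', 1e-3, ...
    'MaxEpochs', 10);

% Train and request the second output to get training progress information.
[frcnn, info] = trainFastRCNNObjectDetector(stopSigns, fastRCNNLayers, options);

% Plot the training loss recorded at each iteration.
figure
plot(info.TrainingLoss)
xlabel('Iteration')
ylabel('Training loss')
```

Training on a CPU can take a long time, so a GPU is advisable even for this small data set.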
trainedDetector = trainFastRCNNObjectDetector(trainingData,checkpoint,options) resumes training from a detector checkpoint.
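A minimal sketch of resuming from a checkpoint, assuming checkpoints were saved by setting 'CheckpointPath' in trainingOptions and that each checkpoint MAT-file stores the detector in a variable named detector. The helper name resumeFromCheckpoint and the checkpointFile argument are hypothetical; supply the path to one of your own checkpoint files.

```matlab
function trainedDetector = resumeFromCheckpoint(checkpointFile, trainingData, options)
% resumeFromCheckpoint  Hypothetical helper, not part of the toolbox.
%   checkpointFile : path to a checkpoint MAT-file written during training
%   trainingData   : the same kind of ground truth used for the original run
%   options        : trainingOptions object for the resumed run

    % Assumption: the checkpoint file holds the detector in the
    % variable 'detector'.
    data = load(checkpointFile);
    checkpoint = data.detector;

    % Pass the checkpoint in place of the network argument.
    trainedDetector = trainFastRCNNObjectDetector(trainingData, checkpoint, options);
end
```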
trainedDetector = trainFastRCNNObjectDetector(trainingData,detector,options) continues training a detector with additional training data or performs more training iterations to improve detector accuracy.
trainedDetector = trainFastRCNNObjectDetector(___,'RegionProposalFcn',proposalFcn) optionally trains using a custom region proposal function, proposalFcn, with any of the previous inputs. If you do not specify a proposal function, then the function uses a variation of the Edge Boxes [2] algorithm.
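The sketch below shows one possible shape for a custom proposal function, assuming the form [bboxes,scores] = proposalFcn(I) with bboxes as an M-by-4 matrix of [x y width height] rows and scores as an M-by-1 vector. The function slidingWindowProposals, its window sizes, and its uniform scores are hypothetical placeholders for illustration; a practical proposal function would rank regions by how likely they are to contain an object.

```matlab
function [bboxes, scores] = slidingWindowProposals(I)
% slidingWindowProposals  Hypothetical region proposal function.
%   Returns square sliding-window boxes at a few fixed scales.
%   bboxes is M-by-4 in [x y width height] format; scores is M-by-1.

    [h, w, ~] = size(I);
    winSizes = [64 128 256];   % illustrative window sizes, in pixels
    bboxes = zeros(0, 4);

    for s = winSizes
        if s > h || s > w
            continue           % window does not fit inside the image
        end
        stride = s / 2;        % 50% overlap between neighboring windows
        [x, y] = meshgrid(1:stride:(w - s + 1), 1:stride:(h - s + 1));
        bboxes = [bboxes; x(:), y(:), repmat(s, numel(x), 2)]; %#ok<AGROW>
    end

    % Every box gets the same score because this sketch has no
    % objectness measure.
    scores = ones(size(bboxes, 1), 1);
end

% Example call (trainingData, network, and options defined elsewhere):
%   detector = trainFastRCNNObjectDetector(trainingData, network, options, ...
%       'RegionProposalFcn', @slidingWindowProposals);
```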
trainedDetector = trainFastRCNNObjectDetector(___,Name,Value) uses additional options specified by one or more Name,Value pair arguments.
To accelerate data preprocessing for training, trainFastRCNNObjectDetector automatically creates and uses a parallel pool based on your parallel preference settings. For more details about setting these preferences, see parallel preference settings. Using parallel computing preferences requires Parallel Computing Toolbox.
VGG-16, VGG-19, ResNet-101, and Inception-ResNet-v2 are large models. Training with large images can produce "Out of Memory" errors. To mitigate these errors, try one or more of these options:
- Reduce the size of your images by using the 'SmallestImageDimension' name-value argument.
- Decrease the value of the 'NumRegionsToSample' name-value argument.
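Both mitigations can be applied in a single call, as in the sketch below. The wrapper name trainWithReducedMemory and the values 400 and 64 are assumptions chosen only for illustration; suitable values depend on your images, network, and available GPU memory.

```matlab
function detector = trainWithReducedMemory(trainingData, network, options)
% trainWithReducedMemory  Hypothetical wrapper, not part of the toolbox.
%   Applies both memory-saving name-value arguments in one call.

    detector = trainFastRCNNObjectDetector(trainingData, network, options, ...
        'SmallestImageDimension', 400, ... % rescale so the shorter side is 400 pixels
        'NumRegionsToSample', 64);         % sample fewer region proposals per image
end
```

Smaller images and fewer sampled regions both reduce memory use, but they can also affect detection accuracy, so it is worth validating the detector after changing them.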
This function supports transfer learning. When you input a network by name, such as 'resnet50', the function automatically transforms the network into a valid Fast R-CNN network model based on the pretrained resnet50 (Deep Learning Toolbox) model. Alternatively, manually specify a custom Fast R-CNN network by using the LayerGraph (Deep Learning Toolbox) extracted from a pretrained DAG network. For more details, see Create Fast R-CNN Object Detection Network.
This table describes how to transform each named network into a Fast R-CNN network. The feature extraction layer name specifies which layer is processed by the ROI pooling layer. The ROI output size specifies the size of the feature maps output by the ROI pooling layer.
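| Network Name | Feature Extraction Layer Name | ROI Pooling Layer OutputSize | Description |
|---|---|---|---|
| alexnet (Deep Learning Toolbox) | 'relu5' | [6 6] | Last max pooling layer is replaced by ROI max pooling layer. |
| vgg16 (Deep Learning Toolbox) | 'relu5_3' | [7 7] | Last max pooling layer is replaced by ROI max pooling layer. |
| vgg19 (Deep Learning Toolbox) | 'relu5_4' | [7 7] | Last max pooling layer is replaced by ROI max pooling layer. |
| squeezenet (Deep Learning Toolbox) | 'fire5-concat' | [14 14] | Last max pooling layer is replaced by ROI max pooling layer. |
| resnet18 (Deep Learning Toolbox) | 'res4b_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
| resnet50 (Deep Learning Toolbox) | 'activation_40_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
| resnet101 (Deep Learning Toolbox) | 'res4b22_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
| googlenet (Deep Learning Toolbox) | 'inception_4d-output' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
| mobilenetv2 (Deep Learning Toolbox) | 'block_13_expand_relu' | [14 14] | ROI pooling layer is inserted after the feature extraction layer. |
| inceptionv3 (Deep Learning Toolbox) | 'mixed7' | [17 17] | ROI pooling layer is inserted after the feature extraction layer. |
| inceptionresnetv2 (Deep Learning Toolbox) | 'block17_20_ac' | [17 17] | ROI pooling layer is inserted after the feature extraction layer. |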
To modify and transform a network into a Fast R-CNN network, see Design an R-CNN, Fast R-CNN, and a Faster R-CNN Model.
Use the trainingOptions (Deep Learning Toolbox) function to enable or disable verbose printing.
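For example, a minimal sketch that turns off the command-window progress display through the 'Verbose' option. The solver and epoch count here are arbitrary choices for illustration.

```matlab
% Create training options with verbose printing disabled.
options = trainingOptions('sgdm', ...
    'MaxEpochs', 10, ...   % illustrative value
    'Verbose', false);     % suppress per-iteration progress output
```

Pass this options object to trainFastRCNNObjectDetector as the options argument.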
[1] Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE International Conference on Computer Vision. 2015.
[2] Zitnick, C. Lawrence, and Piotr Dollar. "Edge Boxes: Locating Object Proposals From Edges." Computer Vision-ECCV 2014. Springer International Publishing, 2014, pp. 391–405.
estimateAnchorBoxes | objectDetectorTrainingData | trainFasterRCNNObjectDetector | trainRCNNObjectDetector | trainingOptions (Deep Learning Toolbox)
boxLabelDatastore | fastRCNNObjectDetector | Layer (Deep Learning Toolbox) | SeriesNetwork (Deep Learning Toolbox)