fcnLayers

Create fully convolutional network layers for semantic segmentation

Description

lgraph = fcnLayers(imageSize,numClasses) returns a fully convolutional network (FCN), configured as FCN 8s, for semantic segmentation. The FCN is preinitialized using layers and weights from the VGG-16 network.

The network returned by fcnLayers includes a pixelClassificationLayer that predicts the categorical label for every pixel in an input image. The pixel classification layer supports only RGB images.

This function requires the Deep Learning Toolbox™ Model for VGG-16 Network support package. If this support package is not installed, then the vgg16 function provides a download link.

lgraph = fcnLayers(imageSize,numClasses,'Type',type) returns an FCN of the model type specified by type.

Examples

Define the image size and number of classes, then create the network.

imageSize = [480 640];
numClasses = 5;
lgraph = fcnLayers(imageSize,numClasses)

Display the network.

plot(lgraph)

Create an FCN 16s network.

imageSize = [480 640];
numClasses = 5;
lgraph = fcnLayers(imageSize,numClasses,'Type','16s')

Display the network.

plot(lgraph)

Input Arguments

imageSize

Network input image size, specified as a 2-element vector in the format [height, width]. The minimum image size is [224 224] because the FCN is based on the VGG-16 network, which expects 224-by-224 input images.
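The size constraint above can be checked before calling fcnLayers. The following is a minimal sketch in plain MATLAB. The helper name isValidFcnImageSize is hypothetical and is not a toolbox function.

```matlab
% Hypothetical helper (not part of any toolbox): returns true when the
% input is a numeric 2-element [height width] vector of at least [224 224].
isValidFcnImageSize = @(sz) isnumeric(sz) && numel(sz) == 2 && ...
    all(sz(:).' >= [224 224]);

okExample = isValidFcnImageSize([480 640]);   % size used in the examples
tooSmall  = isValidFcnImageSize([200 640]);   % height is below the minimum

fprintf('[480 640] valid: %d, [200 640] valid: %d\n', okExample, tooSmall);
```

A check like this only mirrors the documented minimum. fcnLayers may apply additional validation of its own.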

numClasses

Number of classes in the semantic segmentation, specified as an integer greater than 1.

'Type'

Type of FCN model, specified as one of the following values. When 'Type' is not specified, the network is configured as FCN 8s.

'32s': Upsamples the final feature map by a factor of 32. This option provides coarse segmentation with a lower computational cost.

'16s': Upsamples the final feature map by a factor of 16 after fusing the feature map from the fourth max pooling layer. This additional information from earlier layers provides medium-grain segmentation at the cost of additional computation.

'8s': Upsamples the final feature map by a factor of 8 after fusing feature maps from the third and fourth max pooling layers. This additional information from earlier layers provides finer-grain segmentation at the cost of additional computation.
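To make the upsampling factors concrete, the following sketch estimates the size of the feature map that each model type upsamples. It is idealized arithmetic only: it assumes that each max pooling stage halves the spatial size and it ignores any padding or cropping that the actual network applies. The variable names are illustrative.

```matlab
% Idealized arithmetic: assumes every max pooling stage halves the spatial
% size and ignores padding or cropping inside the real network.
imageSize = [480 640];              % size used in the examples
types     = {'32s','16s','8s'};
factors   = [32 16 8];              % upsampling factor for each FCN type

mapSizes = zeros(numel(types),2);
for k = 1:numel(types)
    % Approximate size of the fused feature map before upsampling
    mapSizes(k,:) = ceil(imageSize / factors(k));
    fprintf('%-4s upsamples by %2d from about %d-by-%d\n', ...
        types{k}, factors(k), mapSizes(k,1), mapSizes(k,2));
end
```

Under these assumptions, a 480-by-640 input is upsampled from roughly 15-by-20, 30-by-40, and 60-by-80 feature maps for '32s', '16s', and '8s', respectively, which is why the smaller factors recover finer detail.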

Output Arguments

lgraph

Layers that represent the FCN architecture, returned as a layerGraph object.

All transposed convolution layers are initialized using bilinear interpolation weights. All transposed convolution layer bias terms are fixed to zero.
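As an illustration of bilinear interpolation weights, the following sketch builds a bilinear upsampling kernel using the construction commonly seen in reference FCN code. The kernel size rule and variable names here are assumptions for illustration; the exact sizes that fcnLayers uses internally are not stated in this page.

```matlab
% Sketch of a 2-D bilinear upsampling kernel. The size rule below follows
% a common FCN convention and is an assumption, not a documented detail
% of fcnLayers.
factor     = 2;                               % illustrative upsampling factor
kernelSize = 2*factor - mod(factor,2);        % 4 when factor is 2

% Center of the kernel in zero-based coordinates
if mod(kernelSize,2) == 1
    center = factor - 1;
else
    center = factor - 0.5;
end

pos      = 0:kernelSize-1;
weights1 = 1 - abs(pos - center) / factor;    % 1-D triangular profile
kernel   = weights1' * weights1;              % separable 2-D bilinear kernel
bias     = 0;                                 % bias terms are fixed to zero

disp(kernel)
```

The weights of a kernel built this way sum to factor^2, so each output pixel is a weighted average of its nearest input pixels once the stride is taken into account.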

Tips

References

[1] Long, J., E. Shelhamer, and T. Darrell. "Fully Convolutional Networks for Semantic Segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440.

Introduced in R2017b