This example shows how to create and train a simple semantic segmentation network using Deep Network Designer.
Semantic segmentation describes the process of associating each pixel of an image with a class label (such as flower, person, road, sky, ocean, or car). Applications for semantic segmentation include road segmentation for autonomous driving and cancer cell segmentation for medical diagnosis. To learn more, see Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox).
To train a semantic segmentation network, you need a collection of images and a corresponding collection of pixel-labeled images. A pixel-labeled image is an image where every pixel value represents the categorical label of that pixel. This example uses a simple data set of 32-by-32 images of triangles for illustration purposes. You can interactively label pixels and export the label data for computer vision applications using Image Labeler (Computer Vision Toolbox). For more information on creating training data for semantic segmentation applications, see Label Pixels for Semantic Segmentation (Computer Vision Toolbox).
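As a minimal illustration of what a pixel-labeled image is, the sketch below builds a tiny label matrix by hand and converts it to a categorical array. The 3-by-3 matrix is made up for illustration. The values 255 and 0 mirror the label IDs that this example uses for its two classes.

```matlab
% Hypothetical 3-by-3 label image: 255 marks triangle pixels, 0 marks background.
labelImage = uint8([0 0 255; 0 255 255; 0 0 0]);

% Map each pixel value to a named class. Cast to double so that the data
% and the value set have the same numeric type.
pixelLabels = categorical(double(labelImage),[255 0],["triangle","background"]);

% Count the pixels assigned to each class.
numTriangle = nnz(pixelLabels == "triangle");
numBackground = nnz(pixelLabels == "background");
```

Every element of pixelLabels is a class name rather than a number, which is the form that the pixel label datastore returns when it reads a label image.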
Load the training data.
dataFolder = fullfile(toolboxdir('vision'), ...
    'visiondata','triangleImages');
imageDir = fullfile(dataFolder,'trainingImages');
labelDir = fullfile(dataFolder,'trainingLabels');
Create an ImageDatastore containing the images.
imds = imageDatastore(imageDir);
Create a PixelLabelDatastore containing the ground truth pixel labels. This data set has two classes: "triangle" and "background".
classNames = ["triangle","background"];
labelIDs = [255 0];
pxds = pixelLabelDatastore(labelDir,classNames,labelIDs);
Combine the image datastore and the pixel label datastore into a CombinedDatastore object using the combine function. A combined datastore maintains parity between the pair of images in the underlying datastores.
cds = combine(imds,pxds);
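One way to check the pairing is to look at a single sample from the combined datastore. This is a sketch that assumes imds, pxds, and cds from the previous steps are in the workspace. It uses preview so that the read position of the datastore does not change.

```matlab
% Assumes pxds and cds were created in the previous steps.
sample = preview(cds);   % cell array holding one image and its label matrix
I = sample{1};           % training image
C = sample{2};           % categorical matrix of pixel labels

% The image and its labels should have the same height and width.
sameSize = size(I,1) == size(C,1) && size(I,2) == size(C,2);

% Per-class pixel counts across the whole label datastore.
tbl = countEachLabel(pxds);
disp(tbl)
```

If sameSize is false, or if one class has almost no pixels in tbl, the image and label folders are probably mismatched or the label IDs are wrong.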
Open Deep Network Designer.
deepNetworkDesigner
In Deep Network Designer, you can build, edit, and train deep learning networks. Pause on Blank Network and click New.
Create a semantic segmentation network by dragging layers from the Layer Library to the Designer pane.
Connect the layers in this order:
imageInputLayer with InputSize set to 32,32,1
convolution2dLayer with FilterSize set to 3,3, NumFilters set to 64, and Padding set to 1,1,1,1
reluLayer
maxPooling2dLayer with PoolSize set to 2,2, Stride set to 2,2, and Padding set to 0,0,0,0
convolution2dLayer with FilterSize set to 3,3, NumFilters set to 64, and Padding set to 1,1,1,1
reluLayer
transposedConv2dLayer with FilterSize set to 4,4, NumFilters set to 64, Stride set to 2,2, and Cropping set to 1,1,1,1
convolution2dLayer with FilterSize set to 1,1, NumFilters set to 2, and Padding set to 0,0,0,0
softmaxLayer
pixelClassificationLayer
You can also create this network at the command line and then import the network into Deep Network Designer using deepNetworkDesigner(layers).
layers = [
    imageInputLayer([32 32 1])
    convolution2dLayer([3,3],64,'Padding',[1,1,1,1])
    reluLayer()
    maxPooling2dLayer([2,2],'Stride',[2,2])
    convolution2dLayer([3,3],64,'Padding',[1,1,1,1])
    reluLayer()
    transposedConv2dLayer([4,4],64,'Stride',[2,2],'Cropping',[1,1,1,1])
    convolution2dLayer([1,1],2)
    softmaxLayer()
    pixelClassificationLayer()
    ];
This network is a simple semantic segmentation network based on a downsampling and upsampling design. For more information on constructing a semantic segmentation network, see Create a Semantic Segmentation Network (Computer Vision Toolbox).
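To see why this downsampling and upsampling design returns one prediction per input pixel, you can trace the spatial size through the layers by hand. The sketch below uses the standard output-size formulas for convolution, pooling, and transposed convolution. The helper names convOut and tconvOut are made up for illustration and are not toolbox functions.

```matlab
% Output size of a convolution or pooling layer along one spatial dimension.
% padTotal is the padding summed over both sides.
convOut = @(in,f,s,padTotal) floor((in + padTotal - f)/s) + 1;

% Output size of a transposed convolution along one spatial dimension.
% cropTotal is the cropping summed over both sides.
tconvOut = @(in,f,s,cropTotal) (in - 1)*s + f - cropTotal;

sz = 32;                   % input height (the width follows the same steps)
sz = convOut(sz,3,1,2);    % 3-by-3 convolution, padding 1 per side   -> 32
sz = convOut(sz,2,2,0);    % 2-by-2 max pooling, stride 2             -> 16
sz = convOut(sz,3,1,2);    % 3-by-3 convolution, padding 1 per side   -> 16
sz = tconvOut(sz,4,2,2);   % 4-by-4 transposed convolution, stride 2,
                           % cropping 1 per side                      -> 32
sz = convOut(sz,1,1,0);    % 1-by-1 convolution, no padding           -> 32
```

The final size matches the 32-by-32 input, so the last convolution produces two class scores for every input pixel.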
To import the training datastore, on the Data tab, select Import Data > Import Datastore. Select the CombinedDatastore object cds as the training data. For the validation data, select None. Import the training data by clicking Import.
Deep Network Designer displays a preview of the imported semantic segmentation data. The preview displays the training images and the ground truth pixel labels. The network requires input images (left) and returns a classification for each pixel as either triangle or background (right).
Set the training options and train the network.
On the Training tab, click Training Options. Set InitialLearnRate to 0.001, MaxEpochs to 100, and MiniBatchSize to 64. Set the training options by clicking Close.
Train the network by clicking Train.
Once training is complete, click Export to export the trained network to the workspace. The trained network is stored in the variable trainedNetwork_1.
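For comparison, a roughly equivalent command-line workflow might look like the sketch below. It assumes that cds and layers from the earlier steps are in the workspace. The 'sgdm' solver is an assumption, because this example does not state which solver the app uses, and options not listed here may differ from the app settings. The variable name net is also illustrative.

```matlab
% Assumption: the 'sgdm' solver. The three named options match the values
% set in the Training Options dialog box.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',0.001, ...
    'MaxEpochs',100, ...
    'MiniBatchSize',64);

% Train on the combined datastore of images and pixel labels.
net = trainNetwork(cds,layers,options);
```

Because training starts from random weights, a network trained this way is not identical to trainedNetwork_1.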
Make predictions using test data and the trained network.
Segment the test image using semanticseg. Display the labels over the image by using the labeloverlay function.
imgTest = imread('triangleTest.jpg');
testSeg = semanticseg(imgTest,trainedNetwork_1);
testImageSeg = labeloverlay(imgTest,testSeg);
Display the results.
figure
imshow(testImageSeg)
The network successfully labels the triangles in the test image.
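Beyond the visual check, you can inspect the categorical output directly. This sketch assumes testSeg from the previous step is in the workspace. It extracts a binary mask for the triangle class and reports the fraction of pixels assigned to it.

```matlab
% Assumes testSeg is the categorical label matrix returned by semanticseg.
triangleMask = testSeg == "triangle";                  % logical mask of triangle pixels
triangleFraction = nnz(triangleMask)/numel(testSeg);   % share of pixels in that class
fprintf('Triangle pixels: %.1f%%\n',100*triangleFraction)

% Show the mask on its own, without the underlying image.
figure
imshow(triangleMask)
```

If pixel labels for the test image are available, you could compare them with testSeg in the same way to measure pixel accuracy.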
The semantic segmentation network trained in this example is very simple. To construct more complex semantic segmentation networks, you can use the Computer Vision Toolbox functions segnetLayers (Computer Vision Toolbox), deeplabv3plusLayers (Computer Vision Toolbox), and unetLayers (Computer Vision Toolbox). For an example showing how to use the deeplabv3plusLayers function to create a DeepLab v3+ network, see Semantic Segmentation With Deep Learning (Computer Vision Toolbox).
See also: Deep Network Designer | trainingOptions | deeplabv3plusLayers (Computer Vision Toolbox) | Image Labeler (Computer Vision Toolbox) | pixelClassificationLayer (Computer Vision Toolbox) | pixelLabelDatastore (Computer Vision Toolbox) | segnetLayers (Computer Vision Toolbox) | semanticseg (Computer Vision Toolbox) | unetLayers (Computer Vision Toolbox) | labeloverlay (Image Processing Toolbox)