To create a custom layer that itself defines a layer graph, you can specify a dlnetwork object as a learnable parameter. This method is known as network composition. You can use network composition to:
Create a single custom layer that represents a block of learnable layers. For example, a residual block.
Create networks with control flow. For example, where a section of the network can dynamically change depending on the input data.
Create networks with loops. For example, where a section of the network feeds its output back into itself.
For an example showing how to define a custom layer containing a learnable dlnetwork object, see Define Nested Deep Learning Layer.
For an example showing how to train a network with nested layers, see Train Deep Learning Network with Nested Layers.
When specifying a dlnetwork object as a learnable parameter, the dlnetwork object must have an input layer. Because you must specify the input size of the input layer of the dlnetwork object, you may need to specify the input size when creating the layer. This example code shows how to initialize the input size of the dlnetwork object using the constructor function input argument inputSize.
```matlab
function layer = myLayer(inputSize)

    % Initialize layer properties.
    ...

    % Define network.
    layers = [
        imageInputLayer(inputSize,'Normalization','none')

        % Other network layers go here.
        ];

    lgraph = layerGraph(layers);
    layer.Network = dlnetwork(lgraph);
end
```
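As a fuller illustration of the "block of learnable layers" use case, the following sketch fills in the body of such a constructor with a small residual-style block. The function name residualBlockLayer, the numFilters argument, and the layer names are illustrative assumptions, not part of the example above; the sketch also assumes that the classdef declares Network as a learnable property.

```matlab
% A minimal sketch, assuming the classdef declares a learnable property
% named Network and that inputSize(3) equals numFilters (so the addition
% of the skip connection is valid).
function layer = residualBlockLayer(inputSize,numFilters)
    % Set layer description.
    layer.Description = "Residual block with " + numFilters + " filters";

    % Define the nested network as a layer graph.
    layers = [
        imageInputLayer(inputSize,'Normalization','none','Name','in')
        convolution2dLayer(3,numFilters,'Padding','same','Name','conv1')
        reluLayer('Name','relu1')
        convolution2dLayer(3,numFilters,'Padding','same','Name','conv2')
        additionLayer(2,'Name','add')
        reluLayer('Name','relu2')];

    lgraph = layerGraph(layers);

    % Add the skip connection from the input layer to the second input
    % of the addition layer.
    lgraph = connectLayers(lgraph,'in','add/in2');

    % Store the dlnetwork object as a learnable parameter.
    layer.Network = dlnetwork(lgraph);
end
```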
To help determine the input size to the layer, you can use the analyzeNetwork function and check the size of the activations of the previous layer.
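For example, the following sketch (with hypothetical layers and sizes) analyzes the layers that come before the custom layer; the activations column of the analysis report gives the size to pass as inputSize.

```matlab
% A minimal sketch with hypothetical layers and sizes: analyze the layers
% preceding the custom layer to read off the activation size to use as
% inputSize.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    reluLayer];
analyzeNetwork(layers)   % the activation size after reluLayer here is 28-by-28-by-16
```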
Some layers behave differently during training and during prediction. For example, a dropout layer only performs dropout during training and has no effect during prediction.
When implementing the predict and optionally the forward functions of the custom layer, to ensure that the layers in the dlnetwork object behave correctly, use the predict and forward functions for dlnetwork objects, respectively.
Custom layers with learnable dlnetwork objects do not support custom backward functions. You must still assign a value to the memory output argument of the forward function.
This example code shows how to use the predict and forward functions with dlnetwork input.
```matlab
function Z = predict(layer,X)
    % Convert input data to formatted dlarray.
    X = dlarray(X,'SSCB');

    % Predict using network.
    dlnet = layer.Network;
    Z = predict(dlnet,X);

    % Strip dimension labels.
    Z = stripdims(Z);
end

function [Z,memory] = forward(layer,X)
    % Convert input data to formatted dlarray.
    X = dlarray(X,'SSCB');

    % Forward pass using network.
    dlnet = layer.Network;
    Z = forward(dlnet,X);

    % Strip dimension labels.
    Z = stripdims(Z);

    memory = [];
end
```
If the dlnetwork object does not behave differently during training and prediction, then you can omit the forward function. In this case, the software uses the predict function during training.
Custom layers support dlnetwork objects that do not require state updates. This means that the dlnetwork object must not contain layers that have a state, such as batch normalization and LSTM layers.
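One way to confirm that a custom layer with a nested dlnetwork object satisfies these requirements is to validate it with the checkLayer function. The constructor name and sizes in the following sketch are hypothetical, carried over from the earlier residual-block example.

```matlab
% A minimal sketch with a hypothetical constructor and sizes: checkLayer
% runs a series of validity tests on the custom layer, including the
% forward functions that call into its nested dlnetwork object.
layer = residualBlockLayer([28 28 16],16);
checkLayer(layer,[28 28 16],'ObservationDimension',4)
```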
The following tables list the built-in layers that fully support network composition.
Layer | Description
---|---
imageInputLayer | An image input layer inputs 2-D images to a network and applies data normalization.
image3dInputLayer | A 3-D image input layer inputs 3-D images or volumes to a network and applies data normalization.
sequenceInputLayer | A sequence input layer inputs sequence data to a network.
featureInputLayer | A feature input layer inputs feature data into a network and applies data normalization. Use this layer when you have a data set of numeric scalars representing features (data without spatial or time dimensions).
Layer | Description
---|---
convolution2dLayer | A 2-D convolutional layer applies sliding convolutional filters to the input.
convolution3dLayer | A 3-D convolutional layer applies sliding cuboidal convolution filters to three-dimensional input.
groupedConvolution2dLayer | A 2-D grouped convolutional layer separates the input channels into groups and applies sliding convolutional filters. Use grouped convolutional layers for channel-wise separable (also known as depth-wise separable) convolution.
transposedConv2dLayer | A transposed 2-D convolution layer upsamples feature maps.
transposedConv3dLayer | A transposed 3-D convolution layer upsamples three-dimensional feature maps.
fullyConnectedLayer | A fully connected layer multiplies the input by a weight matrix and then adds a bias vector.
Layer | Description
---|---
groupNormalizationLayer | A group normalization layer divides the channels of the input data into groups and normalizes the activations across each group. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers. You can perform instance normalization and layer normalization by setting the appropriate number of groups.
crossChannelNormalizationLayer | A channel-wise local response (cross-channel) normalization layer carries out channel-wise normalization.
dropoutLayer | A dropout layer randomly sets input elements to zero with a given probability.
crop2dLayer | A 2-D crop layer applies 2-D cropping to the input.
Layer | Description
---|---
averagePooling2dLayer | An average pooling layer performs down-sampling by dividing the input into rectangular pooling regions and computing the average values of each region.
averagePooling3dLayer | A 3-D average pooling layer performs down-sampling by dividing three-dimensional input into cuboidal pooling regions and computing the average values of each region.
globalAveragePooling2dLayer | A global average pooling layer performs down-sampling by computing the mean of the height and width dimensions of the input.
globalAveragePooling3dLayer | A 3-D global average pooling layer performs down-sampling by computing the mean of the height, width, and depth dimensions of the input.
maxPooling2dLayer | A max pooling layer performs down-sampling by dividing the input into rectangular pooling regions, and computing the maximum of each region.
maxPooling3dLayer | A 3-D max pooling layer performs down-sampling by dividing three-dimensional input into cuboidal pooling regions, and computing the maximum of each region.
globalMaxPooling2dLayer | A global max pooling layer performs down-sampling by computing the maximum of the height and width dimensions of the input.
globalMaxPooling3dLayer | A 3-D global max pooling layer performs down-sampling by computing the maximum of the height, width, and depth dimensions of the input.
maxUnpooling2dLayer | A max unpooling layer unpools the output of a max pooling layer.
Layer | Description
---|---
additionLayer | An addition layer adds inputs from multiple neural network layers element-wise.
multiplicationLayer | A multiplication layer multiplies inputs from multiple neural network layers element-wise.
depthConcatenationLayer | A depth concatenation layer takes inputs that have the same height and width and concatenates them along the third dimension (the channel dimension).
concatenationLayer | A concatenation layer takes inputs and concatenates them along a specified dimension. The inputs must have the same size in all dimensions except the concatenation dimension.
If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB® built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
To use a GPU for deep learning, you must also have a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher. For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
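As an illustration of this, the following sketch runs the custom layer's predict function on the GPU simply by passing it gpuArray input. It reuses the hypothetical residualBlockLayer constructor and sizes from the earlier sketch and assumes Parallel Computing Toolbox and a supported GPU are available.

```matlab
% A minimal sketch, assuming Parallel Computing Toolbox, a supported GPU,
% and the hypothetical residualBlockLayer defined earlier. Because the
% layer forward functions operate on dlarray data, the same code runs on
% the GPU when the input is a gpuArray.
layer = residualBlockLayer([28 28 16],16);
X = rand(28,28,16,8,'single');   % hypothetical batch of 8 observations
X = gpuArray(X);                 % move the data to the GPU
Z = predict(layer,X);            % computation executes on the GPU
```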
analyzeNetwork | checkLayer | trainingOptions | trainNetwork