To create a custom layer that itself defines a layer graph, you can specify a dlnetwork object as a learnable parameter. This method is known as network composition. You can use network composition to:
Create a single custom layer that represents a block of learnable layers. For example, a residual block.
Create networks with control flow. For example, where a section of the network can dynamically change depending on the input data.
Create networks with loops. For example, where a section of the network feeds its output back into itself.
For an example showing how to define a custom layer containing a learnable dlnetwork object, see Define Nested Deep Learning Layer.
For an example showing how to train a network with nested layers, see Train Deep Learning Network with Nested Layers.
When specifying a dlnetwork object as a learnable parameter, the dlnetwork object must have an input layer. Because you must specify the input size of the input layer of the dlnetwork object, you may need to specify the input size when creating the layer. This example code shows how to initialize the input size of the dlnetwork object using the constructor function input argument inputSize.
```matlab
function layer = myLayer(inputSize)

    % Initialize layer properties.
    ...

    % Define network.
    layers = [
        imageInputLayer(inputSize,'Normalization','none')

        % Other network layers go here.
        ];

    lgraph = layerGraph(layers);
    layer.Network = dlnetwork(lgraph);
end
```
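As a fuller illustration of the "block of learnable layers" use case, the following sketch fills in the body of such a constructor with a small residual-style block. The function name residualBlockLayer, the numFilters argument, and the layer names are illustrative assumptions, not part of the example above; the sketch also assumes that the classdef declares Network as a learnable property.

```matlab
% A minimal sketch, assuming the classdef declares a learnable property
% named Network and that inputSize(3) equals numFilters (so the addition
% of the skip connection is valid).
function layer = residualBlockLayer(inputSize,numFilters)
    % Set layer description.
    layer.Description = "Residual block with " + numFilters + " filters";

    % Define the nested network as a layer graph.
    layers = [
        imageInputLayer(inputSize,'Normalization','none','Name','in')
        convolution2dLayer(3,numFilters,'Padding','same','Name','conv1')
        reluLayer('Name','relu1')
        convolution2dLayer(3,numFilters,'Padding','same','Name','conv2')
        additionLayer(2,'Name','add')
        reluLayer('Name','relu2')];

    lgraph = layerGraph(layers);

    % Add the skip connection from the input layer to the second input
    % of the addition layer.
    lgraph = connectLayers(lgraph,'in','add/in2');

    % Store the dlnetwork object as a learnable parameter.
    layer.Network = dlnetwork(lgraph);
end
```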
To help determine the input size to the layer, you can use the analyzeNetwork function and check the size of the activations of the previous layer.
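For example, the following sketch (with hypothetical layers and sizes) analyzes the layers that come before the custom layer; the activations column of the analysis report gives the size to pass as inputSize.

```matlab
% A minimal sketch with hypothetical layers and sizes: analyze the layers
% preceding the custom layer to read off the activation size to use as
% inputSize.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    reluLayer];
analyzeNetwork(layers)   % the activation size after reluLayer here is 28-by-28-by-16
```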
Some layers behave differently during training and during prediction. For example, a dropout layer only performs dropout during training and has no effect during prediction.
When implementing the predict and optionally the forward functions of the custom layer, to ensure that the layers in the dlnetwork object behave correctly, use the predict and forward functions for dlnetwork objects, respectively.
Custom layers with learnable dlnetwork objects do not support custom backward functions. You must still assign a value to the memory output argument of the forward function.
This example code shows how to use the predict and forward functions with dlnetwork input.
```matlab
function Z = predict(layer,X)
    % Convert input data to formatted dlarray.
    X = dlarray(X,'SSCB');

    % Predict using network.
    dlnet = layer.Network;
    Z = predict(dlnet,X);

    % Strip dimension labels.
    Z = stripdims(Z);
end

function [Z,memory] = forward(layer,X)
    % Convert input data to formatted dlarray.
    X = dlarray(X,'SSCB');

    % Forward pass using network.
    dlnet = layer.Network;
    Z = forward(dlnet,X);

    % Strip dimension labels.
    Z = stripdims(Z);

    memory = [];
end
```
If the dlnetwork object does not behave differently during training and prediction, then you can omit the forward function. In this case, the software uses the predict function during training.
Custom layers support dlnetwork objects that do not require state updates. This means that the dlnetwork object must not contain layers that have a state, such as batch normalization and LSTM layers.
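One way to confirm that a custom layer with a nested dlnetwork object satisfies these requirements is to validate it with the checkLayer function. The constructor name and sizes in the following sketch are hypothetical, carried over from the earlier residual-block example.

```matlab
% A minimal sketch with a hypothetical constructor and sizes: checkLayer
% runs a series of validity tests on the custom layer, including the
% forward functions that call into its nested dlnetwork object.
layer = residualBlockLayer([28 28 16],16);
checkLayer(layer,[28 28 16],'ObservationDimension',4)
```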
The following tables list the built-in layers that fully support network composition.
Layer | Description
---|---
imageInputLayer | An image input layer inputs 2-D images to a network and applies data normalization.
image3dInputLayer | A 3-D image input layer inputs 3-D images or volumes to a network and applies data normalization.
sequenceInputLayer | A sequence input layer inputs sequence data to a network.
featureInputLayer | A feature input layer inputs feature data into a network and applies data normalization. Use this layer when you have a data set of numeric scalars representing features (data without spatial or time dimensions).
Layer | Description
---|---
convolution2dLayer | A 2-D convolutional layer applies sliding convolutional filters to the input.
convolution3dLayer | A 3-D convolutional layer applies sliding cuboidal convolution filters to three-dimensional input.
groupedConvolution2dLayer | A 2-D grouped convolutional layer separates the input channels into groups and applies sliding convolutional filters. Use grouped convolutional layers for channel-wise separable (also known as depth-wise separable) convolution.
transposedConv2dLayer | A transposed 2-D convolution layer upsamples feature maps.
transposedConv3dLayer | A transposed 3-D convolution layer upsamples three-dimensional feature maps.
fullyConnectedLayer | A fully connected layer multiplies the input by a weight matrix and then adds a bias vector.
Layer | Description
---|---
groupNormalizationLayer | A group normalization layer divides the channels of the input data into groups and normalizes the activations across each group. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers. You can perform instance normalization and layer normalization by setting the appropriate number of groups.
crossChannelNormalizationLayer | A channel-wise local response (cross-channel) normalization layer carries out channel-wise normalization.
dropoutLayer | A dropout layer randomly sets input elements to zero with a given probability.
crop2dLayer | A 2-D crop layer applies 2-D cropping to the input.
Layer | Description
---|---
averagePooling2dLayer | An average pooling layer performs down-sampling by dividing the input into rectangular pooling regions and computing the average values of each region.
averagePooling3dLayer | A 3-D average pooling layer performs down-sampling by dividing three-dimensional input into cuboidal pooling regions and computing the average values of each region.
globalAveragePooling2dLayer | A global average pooling layer performs down-sampling by computing the mean of the height and width dimensions of the input.
globalAveragePooling3dLayer | A 3-D global average pooling layer performs down-sampling by computing the mean of the height, width, and depth dimensions of the input.
maxPooling2dLayer | A max pooling layer performs down-sampling by dividing the input into rectangular pooling regions, and computing the maximum of each region.
maxPooling3dLayer | A 3-D max pooling layer performs down-sampling by dividing three-dimensional input into cuboidal pooling regions, and computing the maximum of each region.
globalMaxPooling2dLayer | A global max pooling layer performs down-sampling by computing the maximum of the height and width dimensions of the input.
globalMaxPooling3dLayer | A 3-D global max pooling layer performs down-sampling by computing the maximum of the height, width, and depth dimensions of the input.
maxUnpooling2dLayer | A max unpooling layer unpools the output of a max pooling layer.
Layer | Description
---|---
additionLayer | An addition layer adds inputs from multiple neural network layers element-wise.
multiplicationLayer | A multiplication layer multiplies inputs from multiple neural network layers element-wise.
depthConcatenationLayer | A depth concatenation layer takes inputs that have the same height and width and concatenates them along the third dimension (the channel dimension).
concatenationLayer | A concatenation layer takes inputs and concatenates them along a specified dimension. The inputs must have the same size in all dimensions except the concatenation dimension.
If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB® built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
To use a GPU for deep learning, you must also have a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher. For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
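As an illustration of this, the following sketch runs the custom layer's predict function on the GPU simply by passing it gpuArray input. It reuses the hypothetical residualBlockLayer constructor and sizes from the earlier sketch and assumes Parallel Computing Toolbox and a supported GPU are available.

```matlab
% A minimal sketch, assuming Parallel Computing Toolbox, a supported GPU,
% and the hypothetical residualBlockLayer defined earlier. Because the
% layer forward functions operate on dlarray data, the same code runs on
% the GPU when the input is a gpuArray.
layer = residualBlockLayer([28 28 16],16);
X = rand(28,28,16,8,'single');   % hypothetical batch of 8 observations
X = gpuArray(X);                 % move the data to the GPU
Z = predict(layer,X);            % computation executes on the GPU
```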
analyzeNetwork | checkLayer | trainingOptions | trainNetwork