sequenceInputLayer

Sequence input layer

Description

A sequence input layer inputs sequence data to a network.

Creation

Syntax

layer = sequenceInputLayer(inputSize)

layer = sequenceInputLayer(inputSize,Name,Value)

Description

layer = sequenceInputLayer(inputSize) creates a sequence input layer and sets the InputSize property.

layer = sequenceInputLayer(inputSize,Name,Value) sets the optional Normalization, Mean, and Name properties using name-value pairs. You can specify multiple name-value pairs. Enclose each property name in single quotes.

Properties

Image Input

`InputSize` — Size of input
positive integer | vector of positive integers

Size of the input, specified as a positive integer or a vector of positive integers.

For vector sequence input, InputSize is a scalar corresponding to the number of features.
For 2-D image sequence input, InputSize is a vector of three elements [h w c], where h is the image height, w is the image width, and c is the number of channels of the image.
For 3-D image sequence input, InputSize is a vector of four elements [h w d c], where h is the image height, w is the image width, d is the image depth, and c is the number of channels of the image.

Example: 100
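
The three forms of InputSize described above can be sketched as follows. The specific sizes are illustrative values, not recommendations.

```matlab
% Vector sequences with 12 features per time step.
vecLayer = sequenceInputLayer(12);

% 2-D image sequences: [h w c] = 224-by-224 RGB frames.
img2dLayer = sequenceInputLayer([224 224 3]);

% 3-D image sequences: [h w d c] = 64-by-64-by-32 single-channel volumes.
img3dLayer = sequenceInputLayer([64 64 32 1]);
```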

`Normalization` — Data normalization
`'none'` (default) | `'zerocenter'` | `'zscore'` | `'rescale-symmetric'` | `'rescale-zero-one'` | function handle

Data normalization to apply every time data is forward propagated through the input layer, specified as one of the following:

'zerocenter' — Subtract the mean specified by Mean.
'zscore' — Subtract the mean specified by Mean and divide by StandardDeviation.
'rescale-symmetric' — Rescale the input to be in the range [-1, 1] using the minimum and maximum values specified by Min and Max, respectively.
'rescale-zero-one' — Rescale the input to be in the range [0, 1] using the minimum and maximum values specified by Min and Max, respectively.
'none' — Do not normalize the input data.
function handle — Normalize the data using the specified function. The function must be of the form Y = func(X), where X is the input data, and the output Y is the normalized data.
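
As a rough illustration of the arithmetic behind the built-in options, the following sketch applies each one by hand to a small matrix using per-channel statistics. It is not the layer's internal implementation. The rescaling formulas and the use of the N-1 normalization in std are assumptions made for this example.

```matlab
% Illustrative arithmetic only; not the layer's internal implementation.
X = [1 2 3 4; 10 20 30 40];   % 2 features (rows) by 4 time steps (columns)

mu    = mean(X,2);            % per-channel mean
sigma = std(X,0,2);           % per-channel standard deviation (N-1, assumed)
xmin  = min(X,[],2);          % per-channel minimum
xmax  = max(X,[],2);          % per-channel maximum

Yzerocenter = X - mu;                              % 'zerocenter'
Yzscore     = (X - mu) ./ sigma;                   % 'zscore'
Yzeroone    = (X - xmin) ./ (xmax - xmin);         % 'rescale-zero-one', range [0, 1]
Ysymmetric  = 2*(X - xmin) ./ (xmax - xmin) - 1;   % 'rescale-symmetric', range [-1, 1]
```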

Tip

The software, by default, automatically calculates the normalization statistics at training time. To save time when training, specify the required statistics for normalization and set the 'ResetInputNormalization' option in trainingOptions to false.

If the input data contains padding, then the layer ignores padding values when normalizing the input data.
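
The tip above can be sketched as follows: compute the statistics once, pass them to the layer, and turn off recalculation at training time. The random training data is a hypothetical stand-in, and computing the statistics over all concatenated time steps is one possible choice for this example.

```matlab
% Hypothetical training data: 3 sequences, 12 features, varying lengths.
XTrain = {rand(12,20); rand(12,35); rand(12,28)};

% Concatenate all time steps and compute per-channel statistics.
allSteps = [XTrain{:}];        % 12-by-83
mu    = mean(allSteps,2);      % 12-by-1
sigma = std(allSteps,0,2);     % 12-by-1

% Supply the statistics to the layer.
layer = sequenceInputLayer(12, ...
    'Normalization','zscore', ...
    'Mean',mu, ...
    'StandardDeviation',sigma);

% Keep the supplied statistics instead of recalculating them.
options = trainingOptions('adam', ...
    'ResetInputNormalization',false);
```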

`NormalizationDimension` — Normalization dimension
`'auto'` (default) | `'channel'` | `'element'` | `'all'`

Normalization dimension, specified as one of the following:

'auto' – If the 'ResetInputNormalization' training option is false and you specify any of the normalization statistics (Mean, StandardDeviation, Min, or Max), then normalize over the dimensions matching the statistics. Otherwise, recalculate the statistics at training time and apply channel-wise normalization.
'channel' – Channel-wise normalization.
'element' – Element-wise normalization.
'all' – Normalize all values using scalar statistics.

`Mean` — Mean for zero-center and z-score normalization
`[]` (default) | numeric array | numeric scalar

Mean for zero-center and z-score normalization, specified as a numeric array, a numeric scalar, or empty.

For vector sequence input, Mean must be an InputSize-by-1 vector of means per channel, a numeric scalar, or [].
For 2-D image sequence input, Mean must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of means per channel, a numeric scalar, or [].
For 3-D image sequence input, Mean must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of means per channel, a numeric scalar, or [].

If you specify the Mean property, then Normalization must be 'zerocenter' or 'zscore'. If Mean is [], then the software calculates the mean at training time.
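
For 2-D image sequences, a per-channel mean has size 1-by-1-by-InputSize(3). A minimal sketch, where the three mean values are hypothetical placeholders:

```matlab
% Hypothetical per-channel means for RGB frames.
channelMeans = reshape([0.485 0.456 0.406],1,1,3);   % 1-by-1-by-3

layer = sequenceInputLayer([224 224 3], ...
    'Normalization','zerocenter', ...
    'Mean',channelMeans);
```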

You can set this property when creating networks without training (for example, when assembling networks using assembleNetwork).

`StandardDeviation` — Standard deviation
`[]` (default) | numeric array | numeric scalar

Standard deviation used for z-score normalization, specified as a numeric array, a numeric scalar, or empty.

For vector sequence input, StandardDeviation must be an InputSize-by-1 vector of standard deviations per channel, a numeric scalar, or [].
For 2-D image sequence input, StandardDeviation must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of standard deviations per channel, a numeric scalar, or [].
For 3-D image sequence input, StandardDeviation must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of standard deviations per channel, a numeric scalar, or [].

If you specify the StandardDeviation property, then Normalization must be 'zscore'. If StandardDeviation is [], then the software calculates the standard deviation at training time.

You can set this property when creating networks without training (for example, when assembling networks using assembleNetwork).

`Min` — Minimum value for rescaling
`[]` (default) | numeric array | numeric scalar

Minimum value for rescaling, specified as a numeric array, a numeric scalar, or empty.

For vector sequence input, Min must be an InputSize-by-1 vector of minima per channel, a numeric scalar, or [].
For 2-D image sequence input, Min must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of minima per channel, a numeric scalar, or [].
For 3-D image sequence input, Min must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of minima per channel, a numeric scalar, or [].

If you specify the Min property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Min is [], then the software calculates the minima at training time.

You can set this property when creating networks without training (for example, when assembling networks using assembleNetwork).

`Max` — Maximum value for rescaling
`[]` (default) | numeric array | numeric scalar

Maximum value for rescaling, specified as a numeric array, a numeric scalar, or empty.

For vector sequence input, Max must be an InputSize-by-1 vector of maxima per channel, a numeric scalar, or [].
For 2-D image sequence input, Max must be a numeric array of the same size as InputSize, a 1-by-1-by-InputSize(3) array of maxima per channel, a numeric scalar, or [].
For 3-D image sequence input, Max must be a numeric array of the same size as InputSize, a 1-by-1-by-1-by-InputSize(4) array of maxima per channel, a numeric scalar, or [].

If you specify the Max property, then Normalization must be 'rescale-symmetric' or 'rescale-zero-one'. If Max is [], then the software calculates the maxima at training time.

You can set this property when creating networks without training (for example, when assembling networks using assembleNetwork).
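
As a sketch of supplying Min and Max together, the following layer rescales vector sequences to the range [0, 1] using scalar bounds. The bounds 0 and 255 are assumed values for data known to lie in that range.

```matlab
% Assumed data range for this example: every feature lies in [0, 255].
layer = sequenceInputLayer(12, ...
    'Normalization','rescale-zero-one', ...
    'Min',0, ...
    'Max',255);
```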

Layer

`Name` — Layer name
`''` (default) | character vector | string scalar

Layer name, specified as a character vector or a string scalar. To include a layer in a layer graph, you must specify a nonempty unique layer name. If you train a series network with the layer and Name is set to '', then the software automatically assigns a name to the layer at training time.

Data Types: char | string

`NumInputs` — Number of inputs
0 (default)

Number of inputs of the layer. The layer has no inputs.

Data Types: double

`InputNames` — Input names
`{}` (default)

Input names of the layer. The layer has no inputs.

Data Types: cell

`NumOutputs` — Number of outputs
1 (default)

Number of outputs of the layer. This layer has a single output only.

Data Types: double

`OutputNames` — Output names
`{'out'}` (default)

Output names of the layer. This layer has a single output only.

Data Types: cell

Examples

Create Sequence Input Layer

Create a sequence input layer with the name 'seq1' and an input size of 12.

layer = sequenceInputLayer(12,'Name','seq1')

layer = 
  SequenceInputLayer with properties:

                      Name: 'seq1'
                 InputSize: 12

   Hyperparameters
             Normalization: 'none'
    NormalizationDimension: 'auto'

Include a sequence input layer in a Layer array.

inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]

layers = 
  5x1 Layer array with layers:

     1   ''   Sequence Input          Sequence input with 12 dimensions
     2   ''   LSTM                    LSTM with 100 hidden units
     3   ''   Fully Connected         9 fully connected layer
     4   ''   Softmax                 softmax
     5   ''   Classification Output   crossentropyex

Create Sequence Input Layer for Image Sequences

Create a sequence input layer for sequences of 224-by-224 RGB images with the name 'seq1'.

layer = sequenceInputLayer([224 224 3], 'Name', 'seq1')

layer = 
  SequenceInputLayer with properties:

                      Name: 'seq1'
                 InputSize: [224 224 3]

   Hyperparameters
             Normalization: 'none'
    NormalizationDimension: 'auto'

Train Network for Sequence Classification

Train a deep learning LSTM network for sequence-to-label classification.

Load the Japanese Vowels data set as described in [1] and [2]. XTrain is a cell array containing 270 sequences of varying length with 12 features corresponding to LPC cepstrum coefficients. YTrain is a categorical vector of labels 1,2,...,9. The entries in XTrain are matrices with 12 rows (one row for each feature) and a varying number of columns (one column for each time step).

[XTrain,YTrain] = japaneseVowelsTrainData;

Visualize the first time series in a plot. Each line corresponds to a feature.

figure
plot(XTrain{1}')
title("Training Observation 1")
numFeatures = size(XTrain{1},1);
legend("Feature " + string(1:numFeatures),'Location','northeastoutside')

Define the LSTM network architecture. Specify the input size as 12 (the number of features of the input data). Specify an LSTM layer to have 100 hidden units and to output the last element of the sequence. Finally, specify nine classes by including a fully connected layer of size 9, followed by a softmax layer and a classification layer.

inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]

layers = 
  5×1 Layer array with layers:

     1   ''   Sequence Input          Sequence input with 12 dimensions
     2   ''   LSTM                    LSTM with 100 hidden units
     3   ''   Fully Connected         9 fully connected layer
     4   ''   Softmax                 softmax
     5   ''   Classification Output   crossentropyex

Specify the training options. Specify the solver as 'adam' and 'GradientThreshold' as 1. Set the mini-batch size to 27 and set the maximum number of epochs to 70.

Because the mini-batches are small with short sequences, the CPU is better suited for training. Set 'ExecutionEnvironment' to 'cpu'. To train on a GPU, if available, set 'ExecutionEnvironment' to 'auto' (the default value).

maxEpochs = 70;
miniBatchSize = 27;

options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...
    'MaxEpochs',maxEpochs, ...
    'MiniBatchSize',miniBatchSize, ...
    'GradientThreshold',1, ...
    'Verbose',false, ...
    'Plots','training-progress');

Train the LSTM network with the specified training options.

net = trainNetwork(XTrain,YTrain,layers,options);

Load the test set and classify the sequences into speakers.

[XTest,YTest] = japaneseVowelsTestData;

Classify the test data. Specify the same mini-batch size used for training.

YPred = classify(net,XTest,'MiniBatchSize',miniBatchSize);

Calculate the classification accuracy of the predictions.

acc = sum(YPred == YTest)./numel(YTest)

acc = 0.9514

Classification LSTM Networks

To create an LSTM network for sequence-to-label classification, create a layer array containing a sequence input layer, an LSTM layer, a fully connected layer, a softmax layer, and a classification output layer.

Set the size of the sequence input layer to the number of features of the input data. Set the size of the fully connected layer to the number of classes. You do not need to specify the sequence length.

For the LSTM layer, specify the number of hidden units and the output mode 'last'.

numFeatures = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

For an example showing how to train an LSTM network for sequence-to-label classification and classify new data, see Sequence Classification Using Deep Learning.

To create an LSTM network for sequence-to-sequence classification, use the same architecture as for sequence-to-label classification, but set the output mode of the LSTM layer to 'sequence'.

numFeatures = 12;
numHiddenUnits = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

Regression LSTM Networks

To create an LSTM network for sequence-to-one regression, create a layer array containing a sequence input layer, an LSTM layer, a fully connected layer, and a regression output layer.

Set the size of the sequence input layer to the number of features of the input data. Set the size of the fully connected layer to the number of responses. You do not need to specify the sequence length.

For the LSTM layer, specify the number of hidden units and the output mode 'last'.

numFeatures = 12;
numHiddenUnits = 125;
numResponses = 1;

layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numResponses)
    regressionLayer];

To create an LSTM network for sequence-to-sequence regression, use the same architecture as for sequence-to-one regression, but set the output mode of the LSTM layer to 'sequence'.

numFeatures = 12;
numHiddenUnits = 125;
numResponses = 1;

layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')
    fullyConnectedLayer(numResponses)
    regressionLayer];

For an example showing how to train an LSTM network for sequence-to-sequence regression and predict on new data, see Sequence-to-Sequence Regression Using Deep Learning.

Deeper LSTM Networks

You can make LSTM networks deeper by inserting extra LSTM layers with the output mode 'sequence' before the LSTM layer. To prevent overfitting, you can insert dropout layers after the LSTM layers.

For sequence-to-label classification networks, the output mode of the last LSTM layer must be 'last'.

numFeatures = 12;
numHiddenUnits1 = 125;
numHiddenUnits2 = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    dropoutLayer(0.2)
    lstmLayer(numHiddenUnits2,'OutputMode','last')
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

For sequence-to-sequence classification networks, the output mode of the last LSTM layer must be 'sequence'.

numFeatures = 12;
numHiddenUnits1 = 125;
numHiddenUnits2 = 100;
numClasses = 9;
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits1,'OutputMode','sequence')
    dropoutLayer(0.2)
    lstmLayer(numHiddenUnits2,'OutputMode','sequence')
    dropoutLayer(0.2)
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

Create Network for Video Classification

Create a deep learning network for data containing sequences of images, such as video and medical image data.

To input sequences of images into a network, use a sequence input layer.
To apply convolutional operations independently to each time step, first convert the sequences of images to an array of images using a sequence folding layer.
To restore the sequence structure after performing these operations, convert this array of images back to image sequences using a sequence unfolding layer.
To convert images to feature vectors, use a flatten layer.

You can then input vector sequences into LSTM and BiLSTM layers.

Define Network Architecture

Create a classification LSTM network that classifies sequences of 28-by-28 grayscale images into 10 classes.

Define the following network architecture:

A sequence input layer with an input size of [28 28 1].
A convolution, batch normalization, and ReLU layer block with 20 5-by-5 filters.
An LSTM layer with 200 hidden units that outputs the last time step only.
A fully connected layer of size 10 (the number of classes) followed by a softmax layer and a classification layer.

To perform the convolutional operations on each time step independently, include a sequence folding layer before the convolutional layers. LSTM layers expect vector sequence input. To restore the sequence structure and reshape the output of the convolutional layers to sequences of feature vectors, insert a sequence unfolding layer and a flatten layer between the convolutional layers and the LSTM layer.

inputSize = [28 28 1];
filterSize = 5;
numFilters = 20;
numHiddenUnits = 200;
numClasses = 10;

layers = [ ...
    sequenceInputLayer(inputSize,'Name','input')
    
    sequenceFoldingLayer('Name','fold')
    
    convolution2dLayer(filterSize,numFilters,'Name','conv')
    batchNormalizationLayer('Name','bn')
    reluLayer('Name','relu')
    
    sequenceUnfoldingLayer('Name','unfold')
    flattenLayer('Name','flatten')
    
    lstmLayer(numHiddenUnits,'OutputMode','last','Name','lstm')
    
    fullyConnectedLayer(numClasses, 'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classification')];

Convert the layers to a layer graph and connect the miniBatchSize output of the sequence folding layer to the corresponding input of the sequence unfolding layer.

lgraph = layerGraph(layers);
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');

View the final network architecture using the plot function.

figure
plot(lgraph)

Compatibility Considerations

`sequenceInputLayer`, by default, uses channel-wise normalization for zero-center normalization

Behavior changed in R2019b

Starting in R2019b, sequenceInputLayer, by default, uses channel-wise normalization for zero-center normalization. In previous versions, this layer uses element-wise normalization. To reproduce this behavior, set the NormalizationDimension option of this layer to 'element'.
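
A minimal sketch of requesting the earlier element-wise behavior explicitly. The input size is an illustrative value.

```matlab
% Request element-wise zero-center normalization instead of the
% channel-wise default.
layer = sequenceInputLayer([28 28 1], ...
    'Normalization','zerocenter', ...
    'NormalizationDimension','element');
```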

`sequenceInputLayer` ignores padding values when normalizing

Behavior changed in R2020a

Starting in R2020a, sequenceInputLayer objects ignore padding values in the input data when normalizing. As a result, the Normalization option now makes training invariant to the corresponding data operation. For example, 'zerocenter' normalization makes the training results invariant to the mean of the data.

If you train on padded sequences, then the calculated normalization factors may be different in earlier versions and can produce different results.
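
The following arithmetic sketch illustrates why a statistic can differ depending on whether padding is counted. It uses a single hypothetical sequence padded with zeros and does not call the layer itself.

```matlab
% One feature, three real time steps (hypothetical values).
seq = [2 4 6];

% The same sequence padded with zeros to length 5.
padded = [seq 0 0];

meanIgnoringPadding  = mean(seq);      % uses only the real time steps
meanIncludingPadding = mean(padded);   % padding pulls the mean toward zero
```

Here the mean over the real time steps is 4, while the mean over the padded sequence is 2.4.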

References

[1] M. Kudo, J. Toyama, and M. Shimbo. "Multidimensional Curve Classification Using Passing-Through Regions." Pattern Recognition Letters. Vol. 20, No. 11–13, 1999, pp. 1103–1111.

[2] UCI Machine Learning Repository: Japanese Vowels Dataset. https://archive.ics.uci.edu/ml/datasets/Japanese+Vowels

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

For code generation, only vector input types are supported.
For vector sequence inputs, the number of features must be a constant during code generation.
Code generation does not support 'Normalization' specified using a function handle.

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

To generate CUDA® or C++ code by using GPU Coder™, you must first construct and train a deep neural network. Once the network is trained and evaluated, you can configure the code generator to generate code and deploy the convolutional neural network on platforms that use NVIDIA® or ARM® GPU processors. For more information, see Deep Learning with GPU Coder (GPU Coder).

For this layer, you can generate code that takes advantage of the NVIDIA CUDA deep neural network library (cuDNN), or the NVIDIA TensorRT™ high performance inference library.

The cuDNN library supports vector and 2-D image sequences. The TensorRT library supports only vector input sequences.
For vector sequence inputs, the number of features must be a constant during code generation.
For image sequence inputs, the height, width, and the number of channels must be a constant during code generation.
Code generation does not support 'Normalization' specified using a function handle.

Introduced in R2017b
