lstm

Long short-term memory

Syntax

dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias)

[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias)

[___] = lstm(___,'DataFormat',FMT)

Description

The long short-term memory (LSTM) operation allows a network to learn long-term dependencies between time steps in time series and sequence data.

Note

This function applies the deep learning LSTM operation to dlarray data. If you want to apply an LSTM operation within a layerGraph object or Layer array, use the following layer:

lstmLayer

dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias) applies a long short-term memory (LSTM) calculation to input dlX using the initial hidden state H0, initial cell state C0, and parameters weights, recurrentWeights, and bias. The input dlX is a formatted dlarray with dimension labels. The output dlY is a formatted dlarray with the same dimension labels as dlX, except for any 'S' dimensions.

The lstm function updates the cell and hidden states using the hyperbolic tangent function (tanh) as the state activation function. The lstm function uses the sigmoid function given by $\sigma(x) = (1 + e^{-x})^{-1}$ as the gate activation function.
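For orientation, the standard LSTM recurrence that matches the activation functions named above can be written as follows. This is a sketch of the textbook formulation, not a statement about the function's internals. The symbols $W_k$, $R_k$, and $b_k$ are assumed here to denote the blocks of weights, recurrentWeights, and bias belonging to gate $k$; how the $4 \times$ numHiddenUnits rows are partitioned among the gates is not specified on this page.

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + R_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + R_f h_{t-1} + b_f) \\
g_t &= \tanh(W_g x_t + R_g h_{t-1} + b_g) \\
o_t &= \sigma(W_o x_t + R_o h_{t-1} + b_o) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Here $x_t$ is the input at time step $t$, $h_t$ and $c_t$ are the hidden and cell states, $\odot$ is element-wise multiplication, and $h_0$ and $c_0$ correspond to H0 and C0.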

[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias) also returns the hidden state and cell state after the LSTM operation.

[___] = lstm(___,'DataFormat',FMT) also specifies the dimension format FMT when dlX is not a formatted dlarray. The output dlY is an unformatted dlarray with the same dimension order as dlX, except for any 'S' dimensions.
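As an illustration of this syntax, the following minimal sketch passes an unformatted dlarray and supplies the dimension order through 'DataFormat'. The sizes are arbitrary, and it assumes that the states and parameters can also be supplied without dimension labels.

```matlab
% Sizes chosen for illustration only.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% Unformatted dlarray: no dimension labels are attached.
dlX = dlarray(randn(numFeatures,numObservations,sequenceLength));

% Initial states and parameters, also without labels (assumption).
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits));
bias = dlarray(randn(4*numHiddenUnits,1));

% Tell lstm how to interpret the dimensions of dlX:
% channel, batch (observation), then time.
dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias,'DataFormat','CBT');

% Inspect the result. Because dlX is unformatted, dlY carries no labels.
size(dlY)
dims(dlY)
```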

Examples

Apply LSTM Operation to Sequence Data

Perform an LSTM operation using three hidden units.

Create the input sequence data as 32 observations with 10 channels and a sequence length of 64.

numFeatures = 10;
numObservations = 32;
sequenceLength = 64;

X = randn(numFeatures,numObservations,sequenceLength);
dlX = dlarray(X,'CBT');

Create the initial hidden and cell states with three hidden units. Use the same initial hidden state and cell state for all observations.

numHiddenUnits = 3;
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);

Create the learnable parameters for the LSTM operation.

weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

Perform the LSTM calculation.

[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

View the size and dimensions of dlY.

size(dlY)

ans = 1×3

     3    32    64

dlY.dims

ans = 
'CBT'

View the size of hiddenState and cellState.

size(hiddenState)

ans = 1×2

     3    32

size(cellState)

ans = 1×2

     3    32

Check that the output hiddenState is the same as the last time step of output dlY.

% Compare the underlying numeric data of both arrays.
if isequal(extractdata(dlY(:,:,end)),extractdata(hiddenState))
   disp("The hidden state and the last time step are equal.")
else
   disp("The hidden state and the last time step are not equal.")
end

The hidden state and the last time step are equal.

You can use the hidden state and cell state to keep track of the state of the LSTM operation and input further sequential data.
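One way to picture this is to process a sequence in two chunks, feeding the states returned by the first call into the second. The sketch below is self-contained and uses the same arbitrary sizes as the example above; the variable names splitPoint, dlYfull, and maxDiff are illustrative. The chunked output would be expected to agree with a single pass over the full sequence up to floating-point round-off.

```matlab
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

% Reference: process the whole sequence in one call.
dlYfull = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

% Process the first half, keeping the final hidden and cell states.
splitPoint = sequenceLength/2;
[dlY1,h1,c1] = lstm(dlX(:,:,1:splitPoint),H0,C0, ...
    weights,recurrentWeights,bias);

% Continue with the second half, starting from the saved states.
[dlY2,h2,c2] = lstm(dlX(:,:,splitPoint+1:end),h1,c1, ...
    weights,recurrentWeights,bias);

% Largest absolute difference between the two ways of computing
% the second half of the output.
maxDiff = max(abs(extractdata(dlYfull(:,:,splitPoint+1:end)) ...
    - extractdata(dlY2)),[],'all')
```

The same pattern applies when data arrives incrementally: keep the most recent hiddenState and cellState and pass them as H0 and C0 on the next call.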

Input Arguments

`dlX` — Input data
`dlarray` | numeric array

Input data, specified as a dlarray with or without dimension labels or a numeric array. When dlX is not a formatted dlarray, you must specify the dimension label format using 'DataFormat',FMT. If dlX is a numeric array, at least one of H0, C0, weights, recurrentWeights, or bias must be a dlarray.

dlX must contain a sequence dimension labeled 'T'. If dlX has any spatial dimensions labeled 'S', they are flattened into the 'C' channel dimensions. If dlX has any unspecified dimensions labeled 'U', they must be singleton.
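To illustrate the flattening of spatial dimensions, the following sketch passes input with one 'S' dimension. It assumes that the input size seen by the weights is the product of the 'C' and 'S' dimension sizes, which follows from the flattening described above but is not stated explicitly on this page. The sizes and the names numSpatial and inputSize are illustrative.

```matlab
numSpatial = 5;
numChannels = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

% Input with one spatial dimension in addition to channel, batch, and time.
X = randn(numSpatial,numChannels,numObservations,sequenceLength);
dlX = dlarray(X,'SCBT');

% Assumption: after flattening 'S' into 'C', each time step has
% numSpatial*numChannels input features.
inputSize = numSpatial*numChannels;

H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,inputSize),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

% The output keeps the labels of dlX except for the 'S' dimension.
size(dlY)
dims(dlY)
```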