This example shows how to create a simple long short-term memory (LSTM) classification network using Deep Network Designer.
To train a deep neural network to classify sequence data, you can use an LSTM network. An LSTM network is a type of recurrent neural network (RNN) that learns long-term dependencies between time steps of sequence data.
The example demonstrates how to:
Load sequence data.
Construct the network architecture interactively.
Specify training options.
Train the network.
Predict the labels of new data and calculate the classification accuracy.
Load the Japanese Vowels data set, as described in [1] and [2]. The predictors are cell arrays containing sequences of varying length with a feature dimension of 12. The labels are categorical vectors of labels 1,2,...,9.
[XTrain,YTrain] = japaneseVowelsTrainData;
[XValidation,YValidation] = japaneseVowelsTestData;
View the sizes of the first few training sequences. The sequences are matrices with 12 rows (one row for each feature) and a varying number of columns (one column for each time step).
XTrain(1:5)
ans=5×1 cell array
{12×20 double}
{12×26 double}
{12×22 double}
{12×20 double}
{12×21 double}
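To see how much the sequence lengths vary across the whole training set, rather than only the first five observations, a short check along the following lines may help. This is a minimal sketch: the variable names sequenceLengths and numFeatures are illustrative and are not part of the original example.

```matlab
% Load the training data (same call as in the example).
[XTrain,YTrain] = japaneseVowelsTrainData;

% Number of time steps (columns) in each sequence.
sequenceLengths = cellfun(@(x) size(x,2),XTrain);

% Number of features (rows) in each sequence.
numFeatures = cellfun(@(x) size(x,1),XTrain);

% Summarize the data set.
fprintf("Observations: %d\n",numel(XTrain));
fprintf("Sequence length: min %d, max %d\n", ...
    min(sequenceLengths),max(sequenceLengths));
fprintf("Classes: %s\n",strjoin(categories(YTrain),", "));
```

If every entry of numFeatures equals 12, the data matches the InputSize used for the sequence input layer later in the example.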
Open Deep Network Designer.
deepNetworkDesigner
Select Blank Network.
Drag a sequenceInputLayer to the canvas and set the InputSize to 12, to match the feature dimension.
Then, drag an lstmLayer to the canvas. Set NumHiddenUnits to 100 and OutputMode to last.
Next, drag a fullyConnectedLayer onto the canvas and set OutputSize to 9, the number of classes.
Finally, drag a softmaxLayer and a classificationLayer onto the canvas. Connect your layers to create a series network.
To check the network and examine more details of the layers, click Analyze. If the Deep Learning Network Analyzer reports zero errors, then the edited network is ready for training.
To export the network architecture, on the Designer tab, click Export. Deep Network Designer saves the network as the variable layers_1.
You can also generate code to construct the network architecture by selecting Export > Generate Code.
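For reference, the architecture built interactively above corresponds roughly to the following layer array. This is a hand-written sketch, not the output of Export > Generate Code; the generated code may differ in detail (for example, it may set layer names explicitly). The variable names inputSize, numHiddenUnits, numClasses, and layers are illustrative.

```matlab
inputSize = 12;        % feature dimension of each sequence
numHiddenUnits = 100;  % number of hidden units in the LSTM layer
numClasses = 9;        % number of label categories

% Series network: input -> LSTM (last time step only) -> FC -> softmax -> output.
layers = [
    sequenceInputLayer(inputSize)
    lstmLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
```

A layer array built this way can be passed to trainNetwork in place of the exported variable layers_1.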
Specify the training options and train the network.
Because the mini-batches are small with short sequences, the CPU is better suited for training. Set 'ExecutionEnvironment' to 'cpu'. To train on a GPU, if available, set 'ExecutionEnvironment' to 'auto' (the default value).
miniBatchSize = 27;
options = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu', ...
    'MaxEpochs',100, ...
    'MiniBatchSize',miniBatchSize, ...
    'ValidationData',{XValidation,YValidation}, ...
    'GradientThreshold',2, ...
    'Shuffle','every-epoch', ...
    'Verbose',false, ...
    'Plots','training-progress');
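One way to sanity-check a mini-batch size is to see how it divides the training set. The sketch below assumes that observations that do not fill a complete mini-batch are left out of each epoch, which is worth confirming against the trainingOptions documentation for your release. The variable names numObservations, numIterationsPerEpoch, and numLeftOver are illustrative.

```matlab
% Load the training predictors (the labels are not needed here).
[XTrain,~] = japaneseVowelsTrainData;

miniBatchSize = 27;
numObservations = numel(XTrain);

% Complete mini-batches per epoch, and observations left over.
numIterationsPerEpoch = floor(numObservations/miniBatchSize);
numLeftOver = mod(numObservations,miniBatchSize);

fprintf("Iterations per epoch: %d\n",numIterationsPerEpoch);
fprintf("Observations not filling a mini-batch: %d\n",numLeftOver);
```

A leftover count of zero means the chosen mini-batch size divides the training data evenly.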
Train the network.
net = trainNetwork(XTrain,YTrain,layers_1,options);
Classify the test data and calculate the classification accuracy. Specify the same mini-batch size as for training.
YPred = classify(net,XValidation,'MiniBatchSize',miniBatchSize);
acc = mean(YPred == YValidation)
acc = 0.9405
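A single accuracy number does not show which classes are confused with each other. As a possible follow-up, a confusion chart gives a per-class view. This sketch assumes that YValidation and YPred already exist in the workspace from the preceding steps.

```matlab
% Assumes YValidation (true labels) and YPred (predicted labels)
% are categorical vectors created by the steps above.
figure
confusionchart(YValidation,YPred, ...
    'RowSummary','row-normalized', ...
    'ColumnSummary','column-normalized');
```

The row summary shows the fraction of each true class that is classified correctly, and the column summary shows the fraction of each predicted class that is correct.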
For next steps, you can try improving the accuracy by using bidirectional LSTM (BiLSTM) layers or by creating a deeper network. For more information, see Long Short-Term Memory Networks.
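As a starting point for the BiLSTM suggestion, the LSTM layer can be swapped for a bilstmLayer with the same settings. This is a minimal sketch under the same assumptions as the rest of the example (12 features, 9 classes); the variable name layersBiLSTM is illustrative, and whether this change improves accuracy on this data set would need to be tested.

```matlab
% Same architecture as before, with a bidirectional LSTM layer
% in place of the LSTM layer.
layersBiLSTM = [
    sequenceInputLayer(12)
    bilstmLayer(100,'OutputMode','last')
    fullyConnectedLayer(9)
    softmaxLayer
    classificationLayer];
```

This layer array can be trained with the same trainNetwork call and training options by passing layersBiLSTM in place of layers_1.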
For an example showing how to use convolutional networks to classify sequence data, see Speech Command Recognition Using Deep Learning.
[1] Kudo, Mineichi, Jun Toyama, and Masaru Shimbo. “Multidimensional Curve Classification Using Passing-through Regions.” Pattern Recognition Letters 20, no. 11–13 (November 1999): 1103–11. https://doi.org/10.1016/S0167-8655(99)00077-X.
[2] Kudo, Mineichi, Jun Toyama, and Masaru Shimbo. Japanese Vowels Data Set. Distributed by UCI Machine Learning Repository. https://archive.ics.uci.edu/ml/datasets/Japanese+Vowels