Create mini-batches for deep learning
Use minibatchqueue to create, preprocess, and manage mini-batches of data for training using custom training loops.

A minibatchqueue iterates over a datastore to provide data in a suitable format for training using custom training loops. Use a minibatchqueue to automatically convert your data to dlarray or gpuArray, convert data to a different precision, or apply a custom function to preprocess your data. You can prepare your data in parallel in the background.
During training, you can manage your data using the minibatchqueue. You can shuffle the data at the start of each training epoch using the shuffle function and collect data from the queue for each training iteration using the next function. You can check if there is any data left in the queue using the hasdata function, and reset the queue when it is empty.
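For example, a minimal sketch of this workflow (assuming a minibatchqueue mbq and a number of epochs numEpochs already exist):

for epoch = 1:numEpochs
    % Shuffle the data at the start of each epoch.
    shuffle(mbq);
    while hasdata(mbq)
        % Collect the next mini-batch from the queue.
        [X,Y] = next(mbq);
        % ... training step using X and Y ...
    end
end
% To iterate over the data again without shuffling, reset the queue.
reset(mbq);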
mbq = minibatchqueue(ds) creates a minibatchqueue from the input datastore ds. The mini-batches in mbq have the same number of variables as the results of read on the input datastore.
mbq = minibatchqueue(ds,numOutputs) creates a minibatchqueue from the input datastore ds and sets the number of variables in each mini-batch. Use this syntax when you use MiniBatchFcn to specify a mini-batch preprocessing function that has a different number of outputs than the number of variables of the input datastore ds.
ds — Input datastore
Input datastore, specified as a MATLAB® datastore or a custom datastore.
For more information about datastores for deep learning, see Datastores for Deep Learning.
numOutputs — Number of mini-batch variables
Number of mini-batch variables, specified as a positive integer. By default, the number of mini-batch variables is equal to the number of variables of the input datastore.
You can determine the number of variables of the input datastore by examining the
output of read(ds)
. If your datastore returns a table, the number
of variables is the number of variables of the table. If your datastore returns a cell
array, the number of variables is the size of the second dimension of the cell array.
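For example, a short check along these lines determines the variable count (a sketch, assuming ds is an existing datastore):

% Inspect one read from the datastore to count its variables.
out = read(ds);
if istable(out)
    numVariables = width(out);   % table: number of table variables
else
    numVariables = size(out,2);  % cell array: size of the second dimension
end
reset(ds)  % rewind the datastore after the test read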
If you use the MiniBatchFcn
name-value pair to specify a
mini-batch preprocessing function that outputs a different number of variables than
the input datastore, you must set numOutputs
to match the number of
outputs of the function.
Example: 2
MiniBatchSize — Size of mini-batches
128 (default) | positive integer
This property is read-only.
Size of mini-batches returned by the next function, specified as a positive integer. The default value is 128.
Example: 256
PartialMiniBatch — Return or discard incomplete mini-batches
"return" (default) | "discard"
Return or discard incomplete mini-batches, specified as "return" or "discard".
If the total number of observations is not exactly divisible by
MiniBatchSize
, the final mini-batch returned by the next
function
can have fewer than MiniBatchSize
observations. This property
specifies how any partial mini-batches are treated, using the following options:
"return"
— A mini-batch can contain fewer than
MiniBatchSize
observations. All data is returned.
"discard"
— All mini-batches must contain exactly
MiniBatchSize
observations. Some data can be discarded from
the queue if there is not enough for a complete mini-batch.
Set PartialMiniBatch
to "discard"
if you
require that all of your mini-batches are the same size.
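For example, this sketch uses hypothetical data to show the effect of "discard": with 100 observations and a mini-batch size of 32, the queue yields three full mini-batches and the remaining 4 observations are never returned.

ds = arrayDatastore(rand(10,100),'IterationDimension',2);
mbq = minibatchqueue(ds,'MiniBatchSize',32,'PartialMiniBatch','discard');
numBatches = 0;
while hasdata(mbq)
    X = next(mbq);
    numBatches = numBatches + 1;
end
numBatches  % 3 with "discard"; would be 4 with "return"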
Example: "discard"
Data Types: char | string
MiniBatchFcn — Mini-batch preprocessing function
"collate" (default) | function handle
This property is read-only.
Mini-batch preprocessing function, specified as "collate" or a function handle.
The default value of MiniBatchFcn is "collate". This function concatenates the mini-batch variables into arrays.
Use a function handle to a custom function to preprocess mini-batches for custom training. Using a custom function is recommended for one-hot encoding classification labels, padding sequence data, calculating average images, and so on. You must specify a custom function if your data consists of cell arrays containing arrays of different sizes.
If you specify a custom mini-batch preprocessing function, the function must
concatenate each batch of output variables into an array after preprocessing and return
each variable as a separate function output. The function must accept at least as many
inputs as the number of variables of the underlying datastore. The inputs are passed to
the custom function as N-by-1 cell arrays, where N
is the number of observations in the mini-batch. The function can return as many
variables as required. If the function specified by MiniBatchFcn
returns a different number of outputs than inputs, specify numOutputs
as the number of outputs of the function.
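For example, the following sketch pads variable-length sequences so that they can be concatenated (padSequenceMiniBatch is a hypothetical name; it assumes the datastore returns one variable whose observations are numChannels-by-numTimeSteps arrays):

function X = padSequenceMiniBatch(XCell)
    % XCell is an N-by-1 cell array of numChannels-by-numTimeSteps
    % arrays with differing numbers of time steps.
    numObs = numel(XCell);
    numChannels = size(XCell{1},1);
    maxLen = max(cellfun(@(x) size(x,2),XCell));
    % Zero-pad each sequence to the longest one and collate into a
    % channel-by-batch-by-time array (suitable for a 'CBT' format).
    X = zeros(numChannels,numObs,maxLen,'single');
    for i = 1:numObs
        len = size(XCell{i},2);
        X(:,i,1:len) = single(XCell{i});
    end
end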
The following actions are not recommended inside the custom function. To reproduce the desired behavior, instead set the corresponding property when you create the minibatchqueue.
Action | Recommended Property
---|---
Cast variable to different data type | OutputCast
Move data to GPU | OutputEnvironment
Convert data to dlarray | OutputAsDlarray
Apply data format to dlarray variable | MiniBatchFormat
Example: @myCustomFunction
Data Types: char | string | function_handle
DispatchInBackground — Preprocess mini-batches in the background in a parallel pool
false or 0 (default) | true or 1
Preprocess mini-batches in the background in a parallel pool, specified as a numeric or logical 1 (true) or 0 (false).
Using this option requires Parallel Computing Toolbox™. The input datastore ds must be partitionable. Custom datastores must implement the matlab.io.datastore.Partitionable class.
Use this option when your mini-batches require heavy preprocessing. This option uses a parallel pool to prepare mini-batches in the background while you use mini-batches during training.
Workers in the pool process mini-batches by applying the function specified by MiniBatchFcn. Further processing, including applying the effects of the OutputCast, OutputEnvironment, OutputAsDlarray, and MiniBatchFormat properties, does not occur on the workers.
When DispatchInBackground is set to true, the software opens a local parallel pool using the current settings, if a local pool is not currently open. Non-local pools are not supported. The pool is opened the first time you call next.
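A minimal construction sketch (myHeavyPreprocessFcn is a hypothetical preprocessing function; ds must be a partitionable datastore):

mbq = minibatchqueue(ds,...
    'MiniBatchFcn',@myHeavyPreprocessFcn,...
    'DispatchInBackground',true);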
Example: true
Data Types: logical
OutputCast — Data type of each mini-batch variable
'single' (default) | 'double' | 'int8' | 'int16' | 'int32' | 'int64' | 'uint8' | 'uint16' | 'uint32' | 'uint64' | 'logical' | 'char' | cell array
This property is read-only.
Data type of each mini-batch variable, specified as 'single', 'double', 'int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'logical', or 'char', or a cell array of these values, or an empty vector.
If you specify OutputCast as an empty vector, the data type of each mini-batch variable is unchanged. To specify a different data type for each mini-batch variable, specify a cell array containing an entry for each mini-batch variable. The order of the elements of this cell array must match the order in which the mini-batch variables are returned. This is the same order as the variables are returned from the function specified by MiniBatchFcn. If you do not specify a custom MiniBatchFcn, it is the same order as the variables returned by the underlying datastore.
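For example, this sketch casts the first mini-batch variable to single and the second to double (assuming the datastore yields two variables, such as images followed by responses):

mbq = minibatchqueue(ds,...
    'OutputCast',{'single','double'});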
You must make sure that the value of OutputCast
does not
conflict with the values of the OutputAsDlarray or OutputEnvironment properties. If you specify OutputAsDlarray as true
or 1
, check
that the data type specified by OutputCast
is supported by dlarray
. If you
specify OutputEnvironment as "gpu"
or "auto"
and a supported GPU is available, check that the data type specified by
OutputCast
is supported by gpuArray
(Parallel Computing Toolbox).
Example: {'single','single','logical'}
Data Types: char | string
OutputAsDlarray — Convert mini-batch variable to dlarray
true or 1 (default) | false or 0 | vector of logical values
This property is read-only.
Convert mini-batch variables to dlarray, specified as a numeric or logical 1 (true) or 0 (false), or as a vector of numeric or logical values.
To specify a different value for each output, specify a vector containing an entry for each mini-batch variable. The order of the elements of this vector must match the order in which the mini-batch variables are returned. This is the same order as the variables are returned from the function specified by MiniBatchFcn. If you do not specify a custom MiniBatchFcn, it is the same order as the variables are returned by the underlying datastore.
Variables that are converted to dlarray have the underlying data type specified by the OutputCast property.
Example: [1,1,0]
Data Types: logical
MiniBatchFormat — Data format of mini-batch variables
'' (default) | char array | cell array
This property is read-only.
Data format of mini-batch variables, specified as a char array or a cell array of char arrays.
The mini-batch format is applied to dlarray variables only. Non-dlarray mini-batch variables must have a MiniBatchFormat of ''.
To avoid an error when you have a mix of dlarray and non-dlarray variables, you must specify a value for each output by providing a cell array containing an entry for each mini-batch variable. The order of the elements of this cell array must match the order in which the mini-batch variables are returned. This is the same order as the variables are returned from the function specified by MiniBatchFcn. If you do not specify a custom MiniBatchFcn, it is the same order as the variables are returned by the underlying datastore.
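For example, this sketch formats only the image variable and leaves the labels unformatted as a non-dlarray (assuming two mini-batch variables, images followed by labels):

mbq = minibatchqueue(ds,...
    'OutputAsDlarray',[1,0],...
    'MiniBatchFormat',{'SSCB',''});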
Example: {'SSCB', ''}
Data Types: char | string
OutputEnvironment — Hardware resource for mini-batch variables
'auto' (default) | 'gpu' | 'cpu' | cell array
Hardware resource for mini-batch variables returned using the next function, specified as one of the following values:
'auto' — Mini-batch variables are returned on the GPU if one is available. Otherwise, mini-batch variables are returned on the CPU.
'gpu' — Mini-batch variables are returned on the GPU.
'cpu' — Mini-batch variables are returned on the CPU.
To return only specific variables on the GPU, specify OutputEnvironment as a cell array containing an entry for each mini-batch variable. The order of the elements of this cell array must match the order in which the mini-batch variables are returned. This is the same order as the variables are returned from the function specified by MiniBatchFcn. If you do not specify a custom MiniBatchFcn, it is the same order as the variables are returned by the underlying datastore.
Using a GPU requires Parallel Computing Toolbox. To use a GPU for deep
learning, you must also have a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher. If you choose the 'gpu'
option and Parallel Computing Toolbox or a suitable GPU is not available, then the software returns an
error.
Example: {'gpu','cpu'}
Data Types: char | string
Use a minibatchqueue
to automatically prepare
mini-batches of images and classification labels for training in a custom training
loop.
Create a datastore. Calling read on auimds
produces a table with
two variables: input
, containing the image data, and
response
, containing the corresponding classification labels.
auimds = augmentedImageDatastore([100 100],digitDatastore);
A = read(auimds);
head(A,2)
ans =

         input          response
    _______________     ________

    {100×100 uint8}        0
    {100×100 uint8}        0
Create a minibatchqueue
from auimds
. Set the
MiniBatchSize
property to 256
.
The minibatchqueue
has two output variables: the images and
classification labels from the input
and response
variables of auimds
, respectively. Set the
minibatchqueue
to return the images as a formatted
dlarray
on the GPU. The images are single channel black and white
images. Add a singleton channel dimension by applying the format
'SSBC'
to the batch. Return the labels as a
non-dlarray
on the CPU.
mbq = minibatchqueue(auimds,...
    'MiniBatchSize',256,...
    'OutputAsDlarray',[1,0],...
    'MiniBatchFormat',{'SSBC',''},...
    'OutputEnvironment',{'gpu','cpu'})
Use the next
function to obtain mini-batches from
mbq
.
[X,Y] = next(mbq);
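You can inspect the returned variables to confirm the configuration (a sketch; the exact sizes depend on your data):

size(X)     % image mini-batch, for example 100×100×1×256
dims(X)     % data format of the dlarray
class(Y)    % labels are a non-dlarray on the CPU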
Preprocess data using a minibatchqueue
with a custom mini-batch preprocessing function. The custom function rescales the incoming image data between 0 and 1 and calculates the average image.
Unzip the data and create a datastore.
unzip("MerchData.zip"); imds = imageDatastore("MerchData", ... "IncludeSubfolders",true, ... "LabelSource",'foldernames');
Create a minibatchqueue
that preprocesses data using the custom function preprocessMiniBatch
defined at the end of this example. The custom function concatenates the image data into a numeric array, rescales the image between 0 and 1, and calculates the average of the batch of images. The function returns the rescaled batch of images and the average image. Set the number of outputs to 2
, to match the number of outputs of the function.
mbq = minibatchqueue(imds,2,...
    'MiniBatchSize',16,...
    'MiniBatchFcn',@preprocessMiniBatch,...
    'OutputAsDlarray',0)
mbq = 
minibatchqueue with 2 outputs and properties:

   Mini-batch creation:
           MiniBatchSize: 16
        PartialMiniBatch: 'return'
            MiniBatchFcn: @preprocessMiniBatch
    DispatchInBackground: 0

   Outputs:
              OutputCast: {'single'  'single'}
         OutputAsDlarray: [0 0]
         MiniBatchFormat: {''  ''}
       OutputEnvironment: {'auto'  'auto'}
Obtain a mini-batch and display the average of the images in the mini-batch.
[X,averageImage] = next(mbq);
imshow(averageImage)
function [X,averageImage] = preprocessMiniBatch(XCell)
    % Concatenate the image data over the fourth (batch) dimension.
    X = cat(4,XCell{:});
    % Rescale the pixel values to the range [0,1].
    X = rescale(X,"InputMin",0,"InputMax",255);
    % Calculate the average image of the mini-batch.
    averageImage = mean(X,4);
end
Use minibatchqueue in a Custom Training Loop
Train a network using minibatchqueue to manage the processing of mini-batches.
Load Training Data
Load the digits training data and store the data in a datastore. Create a datastore for the images and one for the labels using arrayDatastore
. Then, combine the datastores to produce a single datastore to use with minibatchqueue
.
[XTrain,YTrain] = digitTrain4DArrayData;
dsX = arrayDatastore(XTrain,'IterationDimension',4);
dsY = arrayDatastore(YTrain);
dsTrain = combine(dsX,dsY);
Determine the number of unique classes in the label data.
classes = categories(YTrain);
numClasses = numel(classes);
Define Network
Define the network and specify the average image value using the 'Mean'
option in the image input layer.
layers = [
    imageInputLayer([28 28 1],'Name','input','Mean',mean(XTrain,4))
    convolution2dLayer(5,20,'Name','conv1')
    reluLayer('Name','relu1')
    convolution2dLayer(3,20,'Padding',1,'Name','conv2')
    reluLayer('Name','relu2')
    convolution2dLayer(3,20,'Padding',1,'Name','conv3')
    reluLayer('Name','relu3')
    fullyConnectedLayer(numClasses,'Name','fc')
    softmaxLayer('Name','softmax')];
lgraph = layerGraph(layers);
Create a dlnetwork
object from the layer graph.
dlnet = dlnetwork(lgraph);
Define Model Gradients Function
Create the helper function modelGradients
, listed at the end of the example. The function takes a dlnetwork
object dlnet
and a mini-batch of input data dlX
with corresponding labels Y,
and returns the loss and the gradients of the loss with respect to the learnable parameters in dlnet
.
Specify Training Options
Specify the options to use during training.
numEpochs = 10;
miniBatchSize = 128;
Visualize the training progress in a plot.
plots = "training-progress";
Create the minibatchqueue
Use minibatchqueue
to process and manage the mini-batches of images. For each mini-batch:
Discard partial mini-batches.
Use the custom mini-batch preprocessing function preprocessMiniBatch
(defined at the end of this example) to one-hot encode the class labels.
Format the image data with the dimension labels 'SSCB'
(spatial, spatial, channel, batch). By default, the minibatchqueue
object converts the data to dlarray
objects with underlying type single
. Do not add a format to the class labels.
Train on a GPU if one is available. By default, the minibatchqueue
object converts each output to a gpuArray
if a GPU is available. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU with compute capability 3.0 or higher.
mbq = minibatchqueue(dsTrain,...
    'MiniBatchSize',miniBatchSize,...
    'PartialMiniBatch','discard',...
    'MiniBatchFcn',@preprocessMiniBatch,...
    'MiniBatchFormat',{'SSCB',''});
Train Network
Train the model using a custom training loop. For each epoch, shuffle the data and loop over mini-batches while data is still available in the minibatchqueue
. Update the network parameters using the adamupdate
function. At the end of each epoch, display the training progress.
Initialize the training progress plot.
if plots == "training-progress"
    figure
    lineLossTrain = animatedline('Color',[0.85 0.325 0.098]);
    ylim([0 inf])
    xlabel("Iteration")
    ylabel("Loss")
    grid on
end
Initialize the average gradients and squared average gradients.
averageGrad = [];
averageSqGrad = [];
Train the network.
iteration = 0;
start = tic;

for epoch = 1:numEpochs
    % Shuffle data.
    shuffle(mbq);

    while hasdata(mbq)
        iteration = iteration + 1;

        % Read a mini-batch of data.
        [dlX,Y] = next(mbq);

        % Evaluate the model gradients and loss using dlfeval and the
        % modelGradients helper function.
        [grad,loss] = dlfeval(@modelGradients,dlnet,dlX,Y);

        % Update the network parameters using the Adam optimizer.
        [dlnet,averageGrad,averageSqGrad] = adamupdate(dlnet,grad,averageGrad,averageSqGrad,iteration);

        % Display the training progress.
        if plots == "training-progress"
            D = duration(0,0,toc(start),'Format','hh:mm:ss');
            addpoints(lineLossTrain,iteration,double(gather(extractdata(loss))))
            title("Epoch: " + epoch + ", Elapsed: " + string(D))
            drawnow
        end
    end
end
Model Gradients Function
The modelGradients
helper function takes a dlnetwork
object dlnet
and a mini-batch of input data dlX
with corresponding labels Y
, and returns the loss and the gradients of the loss with respect to the learnable parameters in dlnet
. To compute the gradients automatically, use the dlgradient
function.
function [gradients,loss] = modelGradients(dlnet,dlX,Y)
    % Forward pass through the network.
    dlYPred = forward(dlnet,dlX);
    % Compute the cross-entropy loss.
    loss = crossentropy(dlYPred,Y);
    % Compute gradients of the loss with respect to the learnable parameters.
    gradients = dlgradient(loss,dlnet.Learnables);
end
Mini-Batch Preprocessing Function
The preprocessMiniBatch
function preprocesses the data using the following steps:
Extract the image data from the incoming cell array and concatenate into a numeric array. Concatenating the image data over the fourth dimension adds a third dimension to each image, to be used as a singleton channel dimension.
Extract the label data from the incoming cell array and concatenate along the second dimension into a categorical array.
One-hot encode the categorical labels into numeric arrays. Encoding into the first dimension produces an encoded array that matches the shape of the network output.
function [X,Y] = preprocessMiniBatch(XCell,YCell)
    % Extract image data from the cell array and concatenate over the
    % fourth dimension to add a singleton channel dimension.
    X = cat(4,XCell{:});
    % Extract label data from the cell array and concatenate.
    Y = cat(2,YCell{:});
    % One-hot encode the labels.
    Y = onehotencode(Y,1);
end