Compute gradients for custom training loops using automatic differentiation
Use dlgradient to compute derivatives using automatic differentiation for custom training loops.
Tip
For most deep learning tasks, you can use a pretrained network and adapt it to your own data. For an example showing how to use transfer learning to retrain a convolutional neural network to classify a new set of images, see Train Deep Learning Network to Classify New Images. Alternatively, you can create and train networks from scratch using layerGraph objects with the trainNetwork and trainingOptions functions.
If the trainingOptions function does not provide the training options that you need for your task, then you can create a custom training loop using automatic differentiation. To learn more, see Define Deep Learning Network for Custom Training Loops.
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk) returns the gradients of y with respect to the variables x1 through xk.
Call dlgradient from inside a function passed to dlfeval. See Compute Gradient Using Automatic Differentiation and Use Automatic Differentiation In Deep Learning Toolbox.
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk,'RetainData',true) causes the gradient to retain intermediate values for reuse in subsequent dlgradient calls. This syntax can save time, but uses more memory. See Tips.
dlgradient does not support higher order derivatives. In other words, you cannot pass the output of a dlgradient call into another dlgradient call.
A dlgradient call must be inside a function. To obtain a numeric value of a gradient, you must evaluate the function using dlfeval, and the argument to the function must be a dlarray. See Use Automatic Differentiation In Deep Learning Toolbox.
To enable the correct evaluation of gradients, the y argument must use only supported functions for dlarray. See List of Functions with dlarray Support.
If you set the 'RetainData' name-value pair argument to true, the software preserves tracing for the duration of the dlfeval function call instead of erasing the trace immediately after the derivative computation. This preservation can cause a subsequent dlgradient call within the same dlfeval call to be executed faster, but uses more memory. For example, in training an adversarial network, the 'RetainData' setting is useful because the two networks share data and functions during training. See Train Generative Adversarial Network (GAN).
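As a sketch of this pattern (the function and variable names are illustrative, and the two losses simply stand in for the generator and discriminator losses of a GAN), the first dlgradient call sets 'RetainData' to true so that the second call within the same dlfeval evaluation can reuse the retained trace:

x = dlarray(rand(3,1));
[gradLossG,gradLossD] = dlfeval(@modelGradients,x);

function [gradLossG,gradLossD] = modelGradients(x)
    shared = sin(x).*x;                   % computation shared by both losses
    lossG = sum(shared.^2);               % stand-in for the generator loss
    lossD = sum(abs(shared));             % stand-in for the discriminator loss
    gradLossG = dlgradient(lossG,x,'RetainData',true); % keep the trace for reuse
    gradLossD = dlgradient(lossD,x);      % second call reuses the retained intermediate values
end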