Gated recurrent unit
The gated recurrent unit (GRU) operation allows a network to learn dependencies between time steps in time series and sequence data.
Note

This function applies the deep learning GRU operation to dlarray data. If you want to apply a GRU operation within a layerGraph object or Layer array, use the gruLayer layer.
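For example, a minimal sketch of a Layer array that uses gruLayer for sequence classification; the sizes and the surrounding layers are placeholder choices for illustration.

numFeatures = 12;      % placeholder input size
numHiddenUnits = 100;  % placeholder hidden state size
numClasses = 9;        % placeholder number of classes

layers = [
    sequenceInputLayer(numFeatures)
    gruLayer(numHiddenUnits,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];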
dlY = gru(dlX,H0,weights,recurrentWeights,bias) applies a gated recurrent unit (GRU) calculation to input dlX using the initial hidden state H0, and parameters weights, recurrentWeights, and bias. The input dlX is a formatted dlarray with dimension labels. The output dlY is a formatted dlarray with the same dimension labels as dlX, except for any 'S' dimensions.
The gru function updates the hidden state using the hyperbolic tangent function (tanh) as the state activation function. The gru function uses the sigmoid function given by σ(x) = (1 + e^(−x))^(−1) as the gate activation function.
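For example, the following sketch applies gru to random sequence data. The sizes and the 'CBT' (channel, batch, time) dimension format are placeholder choices for illustration, not requirements of the function.

numFeatures = 10;       % placeholder input size
numObservations = 32;   % placeholder mini-batch size
sequenceLength = 20;    % placeholder number of time steps
numHiddenUnits = 64;    % placeholder hidden state size

% Formatted input: channel-by-batch-by-time with 'CBT' labels.
dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');

% Initial hidden state and learnable parameters.
H0 = zeros(numHiddenUnits,numObservations);
weights = dlarray(randn(3*numHiddenUnits,numFeatures));
recurrentWeights = dlarray(randn(3*numHiddenUnits,numHiddenUnits));
bias = dlarray(zeros(3*numHiddenUnits,1));

% Apply the GRU operation. dlY has the same 'CBT' labels as dlX.
dlY = gru(dlX,H0,weights,recurrentWeights,bias);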
[dlY,hiddenState] = gru(dlX,H0,weights,recurrentWeights,bias) also returns the hidden state after the GRU operation.
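For example, a sketch that reuses the placeholder variables from the previous example, retrieves the hidden state after the first half of the sequences, and uses it as the initial state for the second half:

% First ten time steps.
[dlY1,hiddenState] = gru(dlX(:,:,1:10),H0,weights,recurrentWeights,bias);

% Continue from the returned hidden state for the remaining time steps.
[dlY2,hiddenState] = gru(dlX(:,:,11:20),hiddenState,weights,recurrentWeights,bias);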
[___] = gru(___,'DataFormat',FMT) also specifies the dimension format FMT when dlX is not a formatted dlarray. The output dlY is an unformatted dlarray with the same dimension order as dlX, except for any 'S' dimensions.
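For example, a sketch that passes unformatted data and describes its layout with the 'DataFormat' option, again using the placeholder sizes defined above:

% Unformatted input data: channel-by-batch-by-time, without labels.
X = dlarray(randn(numFeatures,numObservations,sequenceLength));

% Describe the layout with 'DataFormat'. dlY is an unformatted dlarray.
dlY = gru(X,H0,weights,recurrentWeights,bias,'DataFormat','CBT');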
functionToLayerGraph does not support the gru function. If you use functionToLayerGraph with a function that contains the gru operation, the resulting LayerGraph contains placeholder layers.
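For example, a sketch (reusing the placeholder parameters above) that converts a function containing the gru operation and then locates the resulting placeholder layers with findPlaceholderLayers:

% Convert a function that contains the gru operation.
fun = @(dlX) gru(dlX,H0,weights,recurrentWeights,bias);
lgraph = functionToLayerGraph(fun,dlX);

% The gru call is represented by placeholder layers.
placeholderLayers = findPlaceholderLayers(lgraph)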
See Also

dlarray | dlfeval | dlgradient | fullyconnect | lstm | softmax