Group normalization layer
A group normalization layer divides the channels of the input data into groups and normalizes the activations across each group. To speed up training of convolutional neural networks and reduce the sensitivity to network initialization, use group normalization layers between convolutional layers and nonlinearities, such as ReLU layers. You can perform instance normalization and layer normalization by setting the appropriate number of groups.
You can use a group normalization layer in place of a batch normalization layer. This is particularly useful when training with small batch sizes as it can increase the stability of training.
The layer first normalizes the activations of each group by subtracting the group mean and dividing by the group standard deviation. Then, the layer shifts the input by a learnable offset β and scales it by a learnable scale factor γ.
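For example, a minimal sketch of a small image classification network that places a group normalization layer between a convolution layer and a ReLU layer might look like this (the input size, number of filters, and group count are illustrative):

% 16 convolution filters split into 4 groups of 4 channels each
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    groupNormalizationLayer(4)
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

Setting the number of groups equal to the number of channels corresponds to instance normalization, and using a single group corresponds to layer normalization.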
layer = groupNormalizationLayer(numGroups) creates a group normalization layer that divides the channels in the layer input into numGroups groups and normalizes across each group.
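For example, this call creates a layer that splits the channels of its input into three groups (the group count is illustrative):

layer = groupNormalizationLayer(3);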
layer = groupNormalizationLayer(numGroups,Name,Value) creates a group normalization layer and sets the optional Normalization, Parameters and Initialization, Learn Rate and Regularization, and Name properties using one or more name-value pair arguments. You can specify multiple name-value pair arguments. Enclose each property name in quotes.
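For example, this sketch (the group count and layer name are illustrative) creates the layer and also sets its Name property:

layer = groupNormalizationLayer(3,'Name','groupnorm1');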
A group normalization layer normalizes its inputs xi by first calculating the mean μg and variance σg² over the specified groups of channels. Then, it calculates the normalized activations as

x̂i = (xi − μg) / √(σg² + ϵ)

Here, ϵ (the Epsilon property) improves numerical stability when the group variance is very small. To allow for the possibility that inputs with zero mean and unit variance are not optimal for the layer that follows the group normalization layer, the group normalization layer further shifts and scales the activations as

yi = γ x̂i + β

Here, the offset β and scale factor γ (the Offset and Scale properties) are learnable parameters that are updated during network training.
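The following MATLAB sketch illustrates this computation directly on an array; it is not the layer's internal implementation, and the array size, group count, and variable names are assumptions for illustration:

% Illustrative input: height-by-width-by-channels observation with 8 channels
X = randn(4,4,8);
numGroups = 2;            % assumed group count; 8 channels -> 4 per group
epsilon = 1e-5;           % corresponds to the Epsilon property
gamma = ones(1,1,8);      % learnable scale (Scale property), initialized to 1
beta  = zeros(1,1,8);     % learnable offset (Offset property), initialized to 0

channelsPerGroup = size(X,3)/numGroups;
Y = zeros(size(X));
for g = 1:numGroups
    idx = (g-1)*channelsPerGroup + (1:channelsPerGroup);
    Xg = X(:,:,idx);
    mu = mean(Xg(:));                           % group mean
    sigma2 = var(Xg(:),1);                      % group variance (normalized by N)
    Xhat = (Xg - mu)./sqrt(sigma2 + epsilon);   % normalize
    Y(:,:,idx) = gamma(:,:,idx).*Xhat + beta(:,:,idx);   % scale and shift
end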
[1] Wu, Yuxin, and Kaiming He. “Group Normalization.” ArXiv:1803.08494 [Cs], June 11, 2018. http://arxiv.org/abs/1803.08494.
batchNormalizationLayer | convolution2dLayer | fullyConnectedLayer | reluLayer | trainingOptions | trainNetwork