crossentropy

Cross-entropy loss for classification tasks

Description

The cross-entropy operation computes the cross-entropy loss between network predictions and target values for single-label and multi-label classification tasks.

Note

This function computes the cross-entropy loss between predictions and targets stored as dlarray data. To calculate the cross-entropy loss within a layerGraph object or Layer array for use with trainNetwork, use a classification output layer instead (see the ClassificationOutputLayer reference page).

dlY = crossentropy(dlX,targets) computes the categorical cross-entropy loss between the predictions dlX and the target values targets for single-label classification tasks. The input dlX is a formatted dlarray with dimension labels. The output dlY is an unformatted scalar dlarray with no dimension labels.

dlY = crossentropy(dlX,targets,'DataFormat',FMT) also specifies the dimension format FMT when dlX is not a formatted dlarray.

dlY = crossentropy(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'TargetCategories','independent' computes the cross-entropy loss for a multi-label classification task.
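As a minimal sketch of how the first two syntaxes relate, the snippet below computes the loss for the same made-up data twice: once with a formatted dlarray, and once with an unformatted dlarray plus 'DataFormat'. The variable names and sizes are illustrative assumptions, not part of the function's interface.

```matlab
% Illustrative data: 3 categories, 5 observations (sizes are arbitrary).
numCategories = 3;
numObservations = 5;

% Random scores normalized so each column sums to 1, like softmax output.
X = rand(numCategories,numObservations);
X = X ./ sum(X,1);

% One-hot targets: one randomly chosen category per observation.
targetsIdx = randi(numCategories,1,numObservations);
targets = zeros(numCategories,numObservations);
targets(sub2ind(size(targets),targetsIdx,1:numObservations)) = 1;

% First syntax: the formatted dlarray carries the 'CB' labels itself.
lossFormatted = crossentropy(dlarray(X,'CB'),targets);

% Second syntax: unformatted dlarray, labels supplied through 'DataFormat'.
lossUnformatted = crossentropy(dlarray(X),targets,'DataFormat','CB');

% Both calls describe the same data, so the two losses should agree.
difference = abs(extractdata(lossFormatted) - extractdata(lossUnformatted));
```

Attaching the labels to the dlarray is usually more convenient when the same data passes through several deep learning operations, because the format then travels with the data.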

Examples

The cross-entropy loss evaluates how well the network predictions correspond to the target classification.

Create the input classification data as a matrix of random variables. The data contains 12 observations that can be in any of 10 categories.

numCategories = 10;
observations = 12;

X = rand(numCategories,observations);
dlX = dlarray(X,'CB');

Convert the category values in the data to probability scores for each category.

dlX = softmax(dlX);

Create the target data, which holds the correct category for each observation in dlX.

targetsIdx = randi(numCategories,1,observations);
targets = zeros(numCategories,observations);
for i = 1:numel(targetsIdx)
    targets(targetsIdx(i),i) = 1;
end

Compute the cross-entropy loss between the predictions and the targets.

dlY = crossentropy(dlX,targets)
dlY = 
  1x1 dlarray

    2.3343

Input Arguments

Predictions, specified as a dlarray with or without dimension labels or a numeric array. When dlX is not a formatted dlarray, you must specify the dimension format using 'DataFormat',FMT. If dlX is a numeric array, targets must be a dlarray.

Data Types: single | double

Target classification labels, specified as a formatted or unformatted dlarray or a numeric array.

If targets is a formatted dlarray, its dimension format must be the same as the format of dlX, or the same as 'DataFormat' if dlX is unformatted.

If targets is an unformatted dlarray or a numeric array, the size of targets must exactly match the size of dlX. The format of dlX or the value of 'DataFormat' is implicitly applied to targets.

Data Types: single | double

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'TargetCategories','independent','DataFormat','CB' evaluates the cross-entropy loss for multi-label classification tasks and specifies the dimension order of the input data as 'CB'.

Type of classification task, specified as the comma-separated pair consisting of 'TargetCategories' and one of the following:

  • 'exclusive' — Single-label classification. Each observation in the predictions dlX is exclusively assigned to one category. The function computes the loss between the target value for the single category specified by targets and the corresponding prediction in dlX, averaged over the number of observations.

  • 'independent' — Multi-label classification. Each observation in the predictions dlX can be assigned to one or more independent categories. The function computes the sum of the loss between each category specified by targets and the predictions in dlX for those categories, averaged over the number of observations. Cross-entropy loss for this type of classification task is also known as binary cross-entropy loss.

The default value is 'exclusive'.

For single-label classification, the loss is calculated using the following formula:

$$\text{loss} = -\frac{1}{N}\sum_{i=1}^{M} T_i \log(X_i)$$

where $X_i$ is the network response, $T_i$ is the target value, $M$ is the total number of responses in X (across all observations and categories), and $N$ is the total number of observations in X.
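The single-label formula above can be checked by hand. The following is a minimal sketch, assuming small made-up data and illustrative variable names, that evaluates the sum directly and compares it with the value returned by crossentropy.

```matlab
% Illustrative sizes: 5 categories, 8 observations.
numCategories = 5;
numObservations = 8;

% Predictions: softmax turns random scores into per-observation probabilities.
dlX = softmax(dlarray(rand(numCategories,numObservations),'CB'));

% One-hot targets with a single active category per observation.
targetsIdx = randi(numCategories,1,numObservations);
targets = zeros(numCategories,numObservations);
targets(sub2ind(size(targets),targetsIdx,1:numObservations)) = 1;

% Manual evaluation: sum T.*log(X) over all M responses, negate,
% and divide by the number of observations N.
X = extractdata(dlX);
N = numObservations;
manualLoss = -sum(targets(:) .* log(X(:))) / N;

% Value returned by the function for the same inputs.
builtinLoss = extractdata(crossentropy(dlX,targets));

% The two values should agree up to floating-point rounding.
difference = abs(manualLoss - builtinLoss);
```

Because the targets are one-hot, only the predicted probability of the correct category contributes to each observation's term in the sum.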

For multi-label classification, the loss is calculated using the following formula:

$$\text{loss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{C}\left(T_{i,j}\log(X_{i,j}) + (1 - T_{i,j})\log(1 - X_{i,j})\right)$$

where $X_{i,j}$ is the network response for observation $i$ and category $j$, $T_{i,j}$ is the target value for that category, $C$ is the total number of categories, and $N$ is the total number of observations. In this case, the loss is the binary cross-entropy of each category prediction, summed over all categories and observations and normalized by the number of observations.
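The multi-label formula above can be checked the same way. This is a minimal sketch under assumed data: the predictions are random values in (0,1) standing in for independent per-category probabilities, and all variable names are illustrative.

```matlab
% Illustrative sizes: 4 categories, 6 observations.
numCategories = 4;
numObservations = 6;

% Independent per-category probabilities; columns need not sum to 1.
X = rand(numCategories,numObservations);
dlX = dlarray(X,'CB');

% Multi-label targets: each category is switched on or off independently.
targets = double(rand(numCategories,numObservations) > 0.5);

% Manual evaluation of the binary cross-entropy terms, summed over all
% categories and observations and divided by the number of observations.
N = numObservations;
terms = targets .* log(X) + (1 - targets) .* log(1 - X);
manualLoss = -sum(terms(:)) / N;

% Value returned by the function in multi-label mode.
dlY = crossentropy(dlX,targets,'TargetCategories','independent');
builtinLoss = extractdata(dlY);

% The two values should agree up to floating-point rounding.
difference = abs(manualLoss - builtinLoss);
```

Unlike the single-label case, every category contributes a term for every observation, including the categories whose target value is 0.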

Example: 'TargetCategories','independent'

Dimension order of unformatted input data, specified as the comma-separated pair consisting of 'DataFormat' and a character array or string FMT that provides a label for each dimension of the data. Each character in FMT must be one of the following:

  • 'S' — Spatial

  • 'C' — Channel

  • 'B' — Batch (for example, samples and observations)

  • 'T' — Time (for example, sequences)

  • 'U' — Unspecified

You can specify multiple dimensions labeled 'S' or 'U'. You can use the labels 'C', 'B', and 'T' at most once.

You must specify 'DataFormat' when the input data dlX is not a formatted dlarray.

Example: 'DataFormat','SSCB'

Data Types: char | string
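As a hedged illustration of a longer format string, the sketch below passes unformatted image-like data labeled 'SSCB', as might arise in per-pixel classification. The sizes, variable names, and the manual normalization over the channel dimension are assumptions made for the example.

```matlab
% Illustrative sizes: 8-by-8 spatial grid, 3 categories, batch of 2.
height = 8; width = 8; numCategories = 3; batchSize = 2;

% Random scores normalized along dimension 3 (the 'C' dimension),
% so each pixel holds a probability vector over the categories.
X = rand(height,width,numCategories,batchSize);
X = X ./ sum(X,3);

% One-hot targets along the channel dimension, one category per pixel.
labelIdx = randi(numCategories,height,width,1,batchSize);
targets = zeros(height,width,numCategories,batchSize);
for c = 1:numCategories
    targets(:,:,c,:) = (labelIdx == c);
end

% The unformatted dlarray has no labels, so 'DataFormat' supplies them.
dlX = dlarray(X);
dlY = crossentropy(dlX,targets,'DataFormat','SSCB');
lossValue = extractdata(dlY);
```

The order of the characters in the format string must follow the order of the array dimensions, so here the first two dimensions are spatial, the third is the channel, and the fourth is the batch.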

Output Arguments

Cross-entropy loss, returned as a dlarray scalar without dimension labels. The output dlY has the same underlying data type as the input dlX.

The cross-entropy loss dlY is the average logarithmic loss across the 'B' batch dimension of dlX.
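The properties of the output described above can be inspected directly. The snippet below is a small sketch with assumed single-precision data and illustrative names: it queries the dimension labels, size, and underlying type of the result.

```matlab
% Illustrative single-precision predictions: 4 categories, 6 observations.
X = rand(4,6,'single');
dlX = softmax(dlarray(X,'CB'));

% Targets that assign every observation to the first category.
targets = zeros(4,6,'single');
targets(1,:) = 1;

dlY = crossentropy(dlX,targets);

% Inspect the result: labels, size, and underlying numeric type.
outputLabels = dims(dlY);               % empty for an unformatted dlarray
outputSize = size(dlY);                 % size of the scalar loss
outputClass = class(extractdata(dlY));  % underlying data type
```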

More About

Cross-Entropy Loss

The crossentropy function computes the cross-entropy loss for classification problems. For more information, see the definition of Classification Output Layer on the ClassificationOutputLayer reference page.

Introduced in R2019b