Error-correcting output codes learner template
t = templateECOC() returns an error-correcting output codes (ECOC) classification learner template.

If you specify a default template, then the software uses default values for all input arguments during training.

t = templateECOC(Name,Value) returns a template with additional options specified by one or more name-value pair arguments.
For example, you can specify a coding design, whether to fit posterior probabilities, or the types of binary learners.
If you display t in the Command Window, then all options appear empty ([]), except those that you specify using name-value pair arguments. During training, the software uses default values for empty options.
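For example, this minimal sketch (the option values are chosen purely for illustration) creates a template with two options set; the rest stay empty until training:

% Sketch: only Coding and FitPosterior are set; all other options
% remain empty ([]) and receive default values during training.
t = templateECOC('Coding','onevsall','FitPosterior',true);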
Use templateECOC to create a default ECOC template.
t = templateECOC()
t = 
Fit template for classification ECOC.

    BinaryLearners: ''
            Coding: ''
      FitPosterior: []
           Options: []
    VerbosityLevel: []
     NumConcurrent: []
           Version: 1
            Method: 'ECOC'
              Type: 'classification'
All properties of the template object are empty except for Method and Type. When you pass t to testckfold, the software fills in the empty properties with their respective default values. For example, the software fills the BinaryLearners property with 'SVM'. For details on other default values, see fitcecoc.
t is a plan for an ECOC learner; no computation occurs when you create it. You can pass t to testckfold to specify a plan for an ECOC classification model to statistically compare with another model.
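For instance, a minimal sketch of this workflow (assuming Fisher's iris data, with the two coding designs chosen purely for illustration):

% Sketch: statistically compare two ECOC plans that differ only in
% coding design; both plans use the same predictors.
load fisheriris
t1 = templateECOC('Coding','onevsone');
t2 = templateECOC('Coding','onevsall');
% h = 1 rejects the default null hypothesis that the two models
% have equal accuracy.
h = testckfold(t1,t2,meas,meas,species)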
One way to select predictors or features is to train two models, where one model uses a subset of the predictors that trained the other. Then statistically compare the predictive performances of the models. If there is sufficient evidence that the model trained on fewer predictors performs better than the model trained using more of the predictors, then you can proceed with the more efficient model.
Load Fisher's iris data set. Plot all 2-dimensional combinations of predictors.
load fisheriris
d = size(meas,2); % Number of predictors
pairs = nchoosek(1:d,2)
pairs = 6×2
1 2
1 3
1 4
2 3
2 4
3 4
for j = 1:size(pairs,1)
    subplot(3,2,j)
    gscatter(meas(:,pairs(j,1)),meas(:,pairs(j,2)),species)
    xlabel(sprintf('meas(:,%d)',pairs(j,1)))
    ylabel(sprintf('meas(:,%d)',pairs(j,2)))
    legend off
end
Based on the scatterplot, meas(:,3) and meas(:,4) seem like they separate the groups well.
Create an ECOC template. Specify to use a one-versus-all coding design.
t = templateECOC('Coding','onevsall');
By default, the ECOC model uses linear SVM binary learners. You can choose other supported algorithms by specifying them using the 'Learners' name-value pair argument.
Test whether an ECOC model that is trained using only predictors 3 and 4 performs at most as well as an ECOC model that is trained using all predictors. Rejecting this null hypothesis means that the ECOC model trained using predictors 3 and 4 performs better than the ECOC model trained using all predictors. Suppose e1 represents the classification error of the ECOC model trained using predictors 3 and 4, and e2 represents the classification error of the ECOC model trained using all predictors. Then the test is:

H0: e1 ≥ e2
H1: e1 < e2
By default, testckfold conducts a 5-by-2 k-fold F test, which is not appropriate as a one-tailed test. Specify to conduct a 5-by-2 k-fold t test instead.
rng(1); % For reproducibility
[h,pValue] = testckfold(t,t,meas(:,pairs(6,:)),meas,species,...
    'Alternative','greater','Test','5x2t')
h = logical
0
pValue = 0.8940
h = 0 indicates that there is not enough evidence to suggest that the model trained using predictors 3 and 4 is more accurate than the model trained using all predictors.
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'Coding','ternarycomplete','FitPosterior',true,'Learners','tree' specifies a ternary complete coding design, to transform scores to posterior probabilities, and to grow classification trees for all binary learners.

'Coding' — Coding design
'onevsone' (default) | 'allpairs' | 'binarycomplete' | 'denserandom' | 'onevsall' | 'ordinal' | 'sparserandom' | 'ternarycomplete' | numeric matrix

Coding design name, specified as the comma-separated pair consisting of 'Coding' and a numeric matrix or a value in this table.
Value | Number of Binary Learners | Description |
---|---|---|
'allpairs' and 'onevsone' | K(K – 1)/2 | For each binary learner, one class is positive, another is negative, and the software ignores the rest. This design exhausts all combinations of class pair assignments. |
'binarycomplete' | 2^(K – 1) – 1 | This design partitions the classes into all binary combinations, and does not ignore any classes. For each binary learner, all class assignments are –1 and 1 with at least one positive and one negative class in the assignment. |
'denserandom' | Random, but approximately 10 log2 K | For each binary learner, the software randomly assigns classes into positive or negative classes, with at least one of each type. For more details, see Random Coding Design Matrices. |
'onevsall' | K | For each binary learner, one class is positive and the rest are negative. This design exhausts all combinations of positive class assignments. |
'ordinal' | K – 1 | For the first binary learner, the first class is negative and the rest are positive. For the second binary learner, the first two classes are negative and the rest are positive, and so on. |
'sparserandom' | Random, but approximately 15 log2 K | For each binary learner, the software randomly assigns classes as positive or negative with probability 0.25 for each, and ignores classes with probability 0.5. For more details, see Random Coding Design Matrices. |
'ternarycomplete' | (3^K – 2^(K + 1) + 1)/2 | This design partitions the classes into all ternary combinations. All class assignments are 0, –1, and 1 with at least one positive and one negative class in the assignment. |
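To inspect the matrix behind a named design, you can generate it with designecoc (listed in See Also); a brief sketch, assuming K = 4 classes:

% Sketch: the one-vs-all coding design matrix for four classes.
% Each row corresponds to a class; each column to a binary learner.
M = designecoc(4,'onevsall')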
You can also specify a coding design using a custom coding matrix. The custom coding matrix is a K-by-L matrix. Each row corresponds to a class and each column corresponds to a binary learner. The class order (rows) corresponds to the order in ClassNames. Compose the matrix by following these guidelines:
Every element of the custom coding matrix must be -1, 0, or 1, and the value must correspond to a dichotomous class assignment. This table describes the meaning of Coding(i,j), that is, the class that learner j assigns to observations in class i.
Value | Dichotomous Class Assignment |
---|---|
–1 | Learner j assigns observations in class i to a negative class. |
0 | Before training, learner j removes observations in class i from the data set. |
1 | Learner j assigns observations in class i to a positive class. |
Every column must contain at least one -1 and at least one 1.
For all column indices i,j such that i ≠ j, Coding(:,i) cannot equal Coding(:,j), and Coding(:,i) cannot equal -Coding(:,j).
All rows of the custom coding matrix must be different.
For more details on the form of custom coding design matrices, see Custom Coding Design Matrices.
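As a sketch under these guidelines, this hypothetical matrix encodes a one-vs-all style design for K = 3 classes and L = 3 binary learners:

% Sketch: a custom K-by-L coding matrix (equivalent to 'onevsall' for
% three classes). Rows follow the class order in ClassNames; columns
% are distinct, are not negations of one another, and each contains
% at least one positive and one negative class assignment.
Coding = [ 1 -1 -1
          -1  1 -1
          -1 -1  1];
t = templateECOC('Coding',Coding);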
Example: 'Coding','ternarycomplete'

Data Types: char | string | double | single | int16 | int32 | int64 | int8
'FitPosterior' — Flag indicating whether to transform scores to posterior probabilities
false or 0 (default) | true or 1
Flag indicating whether to transform scores to posterior probabilities, specified as the comma-separated pair consisting of 'FitPosterior' and either true (1) or false (0).
If FitPosterior is true, then the software transforms binary-learner classification scores to posterior probabilities. You can obtain posterior probabilities by using kfoldPredict, predict, or resubPredict.
fitcecoc does not support fitting posterior probabilities if:

The ensemble method is AdaBoostM2, LPBoost, RUSBoost, RobustBoost, or TotalBoost.

The binary learners (Learners) are linear or kernel classification models that implement SVM. To obtain posterior probabilities for linear or kernel classification models, implement logistic regression instead.
Example: 'FitPosterior',true
Data Types: logical
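The same flag is accepted by fitcecoc when you train a full model. A minimal sketch, assuming Fisher's iris data and the default SVM binary learners:

% Sketch: train an ECOC model with posterior fitting enabled, then
% recover class posterior probabilities for the training data.
load fisheriris
Mdl = fitcecoc(meas,species,'FitPosterior',true);
[labels,~,~,Posterior] = resubPredict(Mdl); % 4th output: posteriors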
'Learners' — Binary learner templates
'svm' (default) | 'discriminant' | 'kernel' | 'knn' | 'linear' | 'naivebayes' | 'tree' | template object | cell vector of template objects

Binary learner templates, specified as the comma-separated pair consisting of 'Learners' and a character vector, string scalar, template object, or cell vector of template objects. Specifically, you can specify binary classifiers such as SVM, and the ensembles that use GentleBoost, LogitBoost, and RobustBoost, to solve multiclass problems. However, fitcecoc also supports multiclass models as binary classifiers.
If Learners is a character vector or string scalar, then the software trains each binary learner using the default values of the specified algorithm. This table summarizes the available algorithms.
Value | Description |
---|---|
'discriminant' | Discriminant analysis. For default options, see templateDiscriminant. |
'kernel' | Kernel classification model. For default options, see templateKernel. |
'knn' | k-nearest neighbors. For default options, see templateKNN. |
'linear' | Linear classification model. For default options, see templateLinear. |
'naivebayes' | Naive Bayes. For default options, see templateNaiveBayes. |
'svm' | SVM. For default options, see templateSVM. |
'tree' | Classification trees. For default options, see templateTree. |
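For example, a sketch that requests default tree binary learners by name (the coding design here is an illustrative choice):

% Sketch: default classification-tree binary learners, specified by name.
t = templateECOC('Learners','tree','Coding','onevsall');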
If Learners is a template object, then each binary learner trains according to the stored options. You can create a template object using:

templateDiscriminant, for discriminant analysis.

templateEnsemble, for ensemble learning. You must at least specify the learning method (Method), the number of learners (NLearn), and the type of learner (Learners). You cannot use the AdaBoostM2 ensemble method for binary learning.

templateKernel, for kernel classification.

templateKNN, for k-nearest neighbors.

templateLinear, for linear classification.

templateNaiveBayes, for naive Bayes.

templateSVM, for SVM.

templateTree, for classification trees.
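A minimal sketch, assuming you want Gaussian-kernel SVMs as the binary learners:

% Sketch: store nondefault SVM options in a template, then use that
% template for every binary learner in the ECOC design.
tSVM = templateSVM('KernelFunction','gaussian','Standardize',true);
t = templateECOC('Learners',tSVM);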
If Learners is a cell vector of template objects, then:

Cell j corresponds to binary learner j (in other words, column j of the coding design matrix), and the cell vector must have length L. L is the number of columns in the coding design matrix. For details, see Coding.

To use one of the built-in loss functions for prediction, all binary learners must return a score in the same range. For example, you cannot include default SVM binary learners with default naive Bayes binary learners. The former returns a score in the range (-∞,∞), and the latter returns a posterior probability as a score. Otherwise, you must provide a custom loss as a function handle to functions such as predict and loss.

You cannot specify linear classification model learner templates with any other template. Similarly, you cannot specify kernel classification model learner templates with any other template.

By default, the software trains learners using default SVM templates.
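A brief sketch, assuming a three-class problem with a one-vs-all design so that L = 3 (the tree depths are illustrative):

% Sketch: one template per binary learner (cell j maps to column j of
% the coding design matrix). All three learners are trees, so their
% scores share a common range.
tShallow = templateTree('MaxNumSplits',1);
tMedium  = templateTree('MaxNumSplits',5);
tDeep    = templateTree('MaxNumSplits',20);
t = templateECOC('Coding','onevsall','Learners',{tShallow,tMedium,tDeep});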
Example: 'Learners','tree'
t — ECOC classification template
ECOC classification template, returned as a template object.
Pass t to testckfold to specify how to create an ECOC classifier whose predictive performance you want to compare with another classifier.
If you display t in the Command Window, then all unspecified options appear empty ([]). However, the software replaces empty options with their corresponding default values during training.
ClassificationECOC | designecoc | fitcecoc | predict | templateDiscriminant | templateEnsemble | templateKNN | templateSVM | templateTree | testckfold