crossval

Cross-validated k-nearest neighbor classifier

Description


cvmodel = crossval(mdl) creates a cross-validated (partitioned) model from mdl, a fitted KNN classification model. By default, crossval uses 10-fold cross-validation on the training data to create cvmodel, a ClassificationPartitionedModel object.

cvmodel = crossval(mdl,Name,Value) creates a partitioned model with additional options specified by one or more name-value pair arguments. For example, specify 'Leaveout','on' for leave-one-out cross-validation.

Examples


Create a cross-validated k-nearest neighbor model, and assess classification performance using the model.

Load the Fisher iris data set.

load fisheriris
X = meas;
Y = species;

Create a k-nearest neighbor classifier. By default, fitcknn uses one nearest neighbor.

mdl = fitcknn(X,Y);

Create a cross-validated classifier.

cvmdl = crossval(mdl)
cvmdl = 
  ClassificationPartitionedModel
    CrossValidatedModel: 'KNN'
         PredictorNames: {'x1'  'x2'  'x3'  'x4'}
           ResponseName: 'Y'
        NumObservations: 150
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'setosa'  'versicolor'  'virginica'}
         ScoreTransform: 'none'



Find the cross-validated loss of the classifier.

cvmdlloss = kfoldLoss(cvmdl)
cvmdlloss = 0.0467

The cross-validated loss, which estimates the misclassification rate on data held out from training, is about 4.7%. You can expect mdl to have a similar error rate on new data.

Input Arguments


mdl: k-nearest neighbor classifier model, specified as a ClassificationKNN object.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: crossval(mdl,'KFold',5) creates a partitioned model with 5-fold cross-validation.

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition object created by the cvpartition function. crossval splits the data into subsets with cvpartition.

Use only one of these four options at a time: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.
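One reason to pass an explicit partition is to evaluate several models on identical folds. The following is a minimal sketch under that assumption; the variable names (c, mdl1, mdl5, loss1, loss5) and the choice of 5 folds and 5 neighbors are illustrative, not part of the original example.

```matlab
load fisheriris
rng(1)                                      % fix the seed so the partition is repeatable

% Stratified 5-fold partition built from the class labels
c = cvpartition(species,'KFold',5);

% Two candidate models that differ only in the number of neighbors
mdl1 = fitcknn(meas,species);                    % default: 1 neighbor
mdl5 = fitcknn(meas,species,'NumNeighbors',5);   % illustrative alternative

% Cross-validate both models on the same folds
cvmdl1 = crossval(mdl1,'CVPartition',c);
cvmdl5 = crossval(mdl5,'CVPartition',c);

loss1 = kfoldLoss(cvmdl1)
loss5 = kfoldLoss(cvmdl5)
```

Because both partitioned models share the partition c, any difference between loss1 and loss5 reflects the models rather than a different random split.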

Fraction of the data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1).

Use only one of these four options at a time: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.

Example: 'Holdout',0.3

Data Types: single | double
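As an illustration of holdout validation, the sketch below reserves a fraction of the iris data for testing and reports the loss on that held-out part. The variable names and the seed are assumptions made for this example.

```matlab
load fisheriris
rng(1)                                   % fix the seed so the random split is repeatable

mdl = fitcknn(meas,species);
cvmdl = crossval(mdl,'Holdout',0.3);     % reserve about 30% of the observations for testing

numTest = cvmdl.Partition.TestSize       % number of held-out observations
holdoutLoss = kfoldLoss(cvmdl)           % misclassification rate on the held-out observations
```

With holdout validation, a single model is trained on the remaining observations, so the reported loss comes from one split rather than an average over folds.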

Number of folds to use in a cross-validated model, specified as the comma-separated pair consisting of 'KFold' and a positive integer value greater than 1.

Use only one of these four options at a time: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.

Example: 'KFold',3

Data Types: single | double

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. Leave-one-out is a special case of 'KFold' in which the number of folds equals the number of observations.

Use only one of these four options at a time: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.

Example: 'Leaveout','on'
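A short sketch of leave-one-out cross-validation on the iris data follows. It checks that the number of folds matches the number of observations, as described above; the variable names are illustrative.

```matlab
load fisheriris

mdl = fitcknn(meas,species);
cvmdl = crossval(mdl,'Leaveout','on');   % one fold per observation

numFolds = cvmdl.KFold                   % should match the number of observations
numObs = cvmdl.NumObservations
looLoss = kfoldLoss(cvmdl)               % leave-one-out misclassification rate
```

Leave-one-out trains one model per observation, so it can be slow on large data sets.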

Tips

  • Assess the predictive performance of mdl on cross-validated data by using the “kfold” methods and properties of cvmodel, such as kfoldLoss.
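The sketch below shows two of the "kfold" methods on a partitioned model: kfoldLoss with per-fold output and kfoldPredict for out-of-fold predictions. The variable names and the use of confusionmat to summarize the predictions are choices made for this illustration.

```matlab
load fisheriris
rng(1)                                   % fix the seed so the folds are repeatable

mdl = fitcknn(meas,species);
cvmdl = crossval(mdl);                   % default 10-fold cross-validation

% Loss for each fold separately instead of the average over folds
foldLoss = kfoldLoss(cvmdl,'Mode','individual')

% Predicted label for each observation, made by the model that did not train on it
labels = kfoldPredict(cvmdl);

% Rows are true classes, columns are predicted classes
confMat = confusionmat(species,labels)
```

Looking at the per-fold losses gives a rough sense of how much the error estimate varies from one split to another.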

Alternative Functionality

You can create a cross-validated model directly from the data instead of creating a model followed by a cross-validated model. To do so, specify one of these options in fitcknn: 'CrossVal', 'KFold', 'Holdout', 'Leaveout', or 'CVPartition'.
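For example, the following sketch passes 'KFold' directly to fitcknn, which returns a partitioned model without an intermediate ClassificationKNN object. The fold count of 5 and the variable names are illustrative.

```matlab
load fisheriris
rng(1)                                     % fix the seed so the folds are repeatable

% Train and cross-validate in one call
cvmdl = fitcknn(meas,species,'KFold',5);

cvloss = kfoldLoss(cvmdl)                  % average misclassification rate over the 5 folds
```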

Introduced in R2012a