crossval

Cross-validate support vector machine (SVM) classifier

Description

CVSVMModel = crossval(SVMModel) returns a cross-validated (partitioned) support vector machine (SVM) classifier (CVSVMModel) from a trained SVM classifier (SVMModel). By default, crossval uses 10-fold cross-validation on the training data to create CVSVMModel, a ClassificationPartitionedModel classifier.

CVSVMModel = crossval(SVMModel,Name,Value) returns a partitioned SVM classifier with additional options specified by one or more name-value pair arguments. For example, you can specify the number of folds or holdout sample proportion.

Examples

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. Standardize the predictor data and specify the order of the classes.

SVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});

SVMModel is a trained ClassificationSVM classifier. 'b' is the negative class and 'g' is the positive class.

Cross-validate the classifier using 10-fold cross-validation.

CVSVMModel = crossval(SVMModel)
CVSVMModel = 
  ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 10
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'


  Properties, Methods

Display the first trained classifier.

FirstModel = CVSVMModel.Trained{1}
FirstModel = 
  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [78x1 double]
                     Bias: -0.2209
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [78x34 double]
      SupportVectorLabels: [78x1 double]


  Properties, Methods

CVSVMModel is a ClassificationPartitionedModel cross-validated classifier. During cross-validation, the software completes these steps:

  1. Randomly partition the data into 10 sets of equal size.

  2. Train an SVM classifier on nine of the sets.

  3. Repeat step 2 a total of k = 10 times, leaving out a different partition each time and training on the other nine partitions.

  4. Combine the generalization statistics from the 10 folds.

FirstModel is the first of the 10 trained classifiers. It is a CompactClassificationSVM classifier.

You can estimate the generalization error by passing CVSVMModel to kfoldLoss.
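For example, assuming CVSVMModel from above is still in the workspace, a single call returns the misclassification rate averaged over the 10 held-out folds:

kfoldLoss(CVSVMModel)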

Specify a holdout sample proportion for cross-validation. By default, crossval uses 10-fold cross-validation to cross-validate an SVM classifier. However, you have several other options for cross-validation. For example, you can specify a different number of folds or holdout sample proportion.

Load the ionosphere data set.

load ionosphere
rng(1); % For reproducibility

Train an SVM classifier. Standardize the data and specify that 'g' is the positive class.

SVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});

SVMModel is a trained ClassificationSVM classifier.

Cross-validate the classifier by specifying a 15% holdout sample.

CVSVMModel = crossval(SVMModel,'Holdout',0.15)
CVSVMModel = 
  ClassificationPartitionedModel
    CrossValidatedModel: 'SVM'
         PredictorNames: {1x34 cell}
           ResponseName: 'Y'
        NumObservations: 351
                  KFold: 1
              Partition: [1x1 cvpartition]
             ClassNames: {'b'  'g'}
         ScoreTransform: 'none'


  Properties, Methods

CVSVMModel is a ClassificationPartitionedModel.

Display properties of the classifier trained using 85% of the data.

TrainedModel = CVSVMModel.Trained{1}
TrainedModel = 
  CompactClassificationSVM
             ResponseName: 'Y'
    CategoricalPredictors: []
               ClassNames: {'b'  'g'}
           ScoreTransform: 'none'
                    Alpha: [74x1 double]
                     Bias: -0.2952
         KernelParameters: [1x1 struct]
                       Mu: [1x34 double]
                    Sigma: [1x34 double]
           SupportVectors: [74x34 double]
      SupportVectorLabels: [74x1 double]


  Properties, Methods

TrainedModel is a CompactClassificationSVM classifier trained using 85% of the data.

Estimate the generalization error.

kfoldLoss(CVSVMModel)
ans = 0.0769

The out-of-sample misclassification error is approximately 8%.

Input Arguments

SVMModel — Full, trained SVM classifier, specified as a ClassificationSVM model trained with fitcsvm.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: crossval(SVMModel,'KFold',5) specifies using five folds in a cross-validated model.

Cross-validation partition, specified as the comma-separated pair consisting of 'CVPartition' and a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
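The example above can be written as a short sketch. Here, X500 and Y500 are hypothetical predictor and response data with 500 observations; the SVM classifier must be trained on the same observations that the partition indexes.

cvp = cvpartition(500,'KFold',5);          % 5-fold partition of 500 observations
SVMModel = fitcsvm(X500,Y500);             % train on those same observations
CVSVMModel = crossval(SVMModel,'CVPartition',cvp);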

Fraction of the data used for holdout validation, specified as the comma-separated pair consisting of 'Holdout' and a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:

  1. Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.

  2. Store the compact, trained model in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Holdout',0.1

Data Types: double | single

Number of folds to use in a cross-validated model, specified as the comma-separated pair consisting of 'KFold' and a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:

  1. Randomly partition the data into k sets.

  2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.

  3. Store the k compact, trained models in the cells of a k-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'KFold',5

Data Types: single | double

Leave-one-out cross-validation flag, specified as the comma-separated pair consisting of 'Leaveout' and 'on' or 'off'. If you specify 'Leaveout','on', then, for each of the n observations (where n is the number of observations excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:

  1. Reserve the observation as validation data, and train the model using the other n – 1 observations.

  2. Store the n compact, trained models in the cells of an n-by-1 cell vector in the Trained property of the cross-validated model.

To create a cross-validated model, you can use one of these four name-value pair arguments only: CVPartition, Holdout, KFold, or Leaveout.

Example: 'Leaveout','on'

Tips

Assess the predictive performance of SVMModel on cross-validated data by using the kfold methods and properties of CVSVMModel, such as kfoldLoss.

Alternative Functionality

Instead of training an SVM classifier and then cross-validating it, you can create a cross-validated classifier directly by using fitcsvm and specifying any of these name-value pair arguments: 'CrossVal', 'CVPartition', 'Holdout', 'Leaveout', or 'KFold'.
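As a sketch of this alternative, using the ionosphere data from the examples above, the two-step train-then-cross-validate workflow collapses into a single fitcsvm call, which returns a ClassificationPartitionedModel directly:

load ionosphere
rng(1); % For reproducibility
CVSVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'},'KFold',10);
kfoldLoss(CVSVMModel) % average misclassification rate over the 10 folds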

Introduced in R2014a