crossval

Class: RegressionTree

Cross-validated decision tree

Syntax

cvmodel = crossval(model)
cvmodel = crossval(model,Name,Value)

Description

cvmodel = crossval(model) creates a partitioned model from model, a fitted regression tree. By default, crossval uses 10-fold cross-validation on the training data to create cvmodel.

cvmodel = crossval(model,Name,Value) creates a partitioned model with additional options specified by one or more Name,Value pair arguments.

Input Arguments

model

A regression tree of class RegressionTree, created using fitrtree.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

'CVPartition'

Object of class cvpartition, created by the cvpartition function. crossval splits the data into subsets according to this partition.

Use only one of these four options at a time: 'KFold', 'Holdout', 'Leaveout', or 'CVPartition'.

Default: []
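
As a minimal sketch of passing a custom partition, the following reuses the carsmall data from the Examples section. The seed value and the choice of five folds are illustrative assumptions. The partition is sized from Mdl.NumObservations on the assumption that it must match the observations the tree was actually trained on, which can be fewer than the rows of X when the response contains missing values.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % illustrative seed so the random partition is repeatable
% Size the partition from the fitted model, not from the raw data,
% because rows with a missing response are not used for training.
c = cvpartition(Mdl.NumObservations,'KFold',5);
CVMdl = crossval(Mdl,'CVPartition',c);
numFolds = numel(CVMdl.Trained);  % one compact tree per fold
```

Creating the partition yourself is useful when several models should be compared on identical folds.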

'Holdout'

Fraction of the data to hold out for validation, a numeric scalar in the range (0,1). Holdout validation tests the specified fraction of the data and uses the rest of the data for training.

Use only one of these four options at a time: 'KFold', 'Holdout', 'Leaveout', or 'CVPartition'.
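
A short sketch of holdout validation, again on the carsmall data. The 30% holdout fraction and the seed are illustrative assumptions, not recommended values.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % illustrative seed; the holdout split is random
CVMdl = crossval(Mdl,'Holdout',0.3);  % reserve about 30% for testing
numTrained = numel(CVMdl.Trained);    % holdout trains a single tree
holdoutMSE = kfoldLoss(CVMdl);        % MSE on the held-out observations
```

Because only one tree is trained, the resulting error estimate depends more heavily on the particular split than a k-fold estimate does.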

'KFold'

Number of folds to use in a cross-validated tree, a positive integer value greater than 1.

Use only one of these four options at a time: 'KFold', 'Holdout', 'Leaveout', or 'CVPartition'.

Default: 10
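
For instance, the sketch below requests five folds instead of the default ten and inspects the size of each test fold. The fold count and seed are illustrative assumptions.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % illustrative seed; fold assignment is random
CVMdl = crossval(Mdl,'KFold',5);
numFolds = CVMdl.KFold;                % number of folds actually used
testSizes = CVMdl.Partition.TestSize;  % observations in each test fold
```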

'Leaveout'

Set to 'on' for leave-one-out cross-validation.
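
A minimal sketch of leave-one-out cross-validation on the carsmall data. It trains one tree per observation, so it is best suited to small data sets such as this one.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

CVMdl = crossval(Mdl,'Leaveout','on');
numTrained = numel(CVMdl.Trained);  % one tree per training observation
looMSE = kfoldLoss(CVMdl);          % leave-one-out mean squared error
```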

Output Arguments

cvmodel

A partitioned model of class RegressionPartitionedModel.

Examples

Load the carsmall data set. Consider Acceleration, Displacement, Horsepower, and Weight as predictor variables.

load carsmall
X = [Acceleration Displacement Horsepower Weight];

Grow a regression tree using the entire data set.

Mdl = fitrtree(X,MPG);

Mdl is a RegressionTree model.

Cross-validate the regression tree using 10-fold cross-validation.

CVMdl = crossval(Mdl);

CVMdl is a RegressionPartitionedModel cross-validated model. crossval stores the ten trained, compact regression trees in the Trained property of CVMdl.

Display the compact regression tree that crossval trained using all observations except those in the first fold.

CVMdl.Trained{1}
ans = 
  CompactRegressionTree
           PredictorNames: {'x1'  'x2'  'x3'  'x4'}
             ResponseName: 'Y'
    CategoricalPredictors: []
        ResponseTransform: 'none'


  Properties, Methods

Estimate the generalization error of Mdl by computing the 10-fold cross-validated mean-squared error.

L = kfoldLoss(CVMdl)
L = 23.5706

Tips

  • Assess the predictive performance of model on cross-validated data using the “kfold” methods and properties of cvmodel, such as kfoldLoss.
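
As a sketch of this tip, the following computes out-of-fold predictions with kfoldPredict and the per-fold losses with kfoldLoss. The seed is an illustrative assumption, and the exact loss values depend on the random partition.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];
Mdl = fitrtree(X,MPG);

rng(1)  % illustrative seed; fold assignment is random
CVMdl = crossval(Mdl);

% Each prediction comes from the tree trained without that observation.
yHat = kfoldPredict(CVMdl);

% One mean squared error per fold instead of a single averaged value.
lossPerFold = kfoldLoss(CVMdl,'Mode','individual');
```

A wide spread in lossPerFold suggests that the error estimate is sensitive to how the data are split.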

Alternatives

You can create a cross-validated tree directly from the data, instead of first creating a decision tree and then cross-validating it. To do so, include one of these five options in the call to fitrtree: 'CrossVal', 'KFold', 'Holdout', 'Leaveout', or 'CVPartition'.
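
A minimal sketch of this one-step alternative, using the carsmall data. The fold count and seed are illustrative assumptions.

```matlab
load carsmall
X = [Acceleration Displacement Horsepower Weight];

rng(1)  % illustrative seed; fold assignment is random
% fitrtree returns a cross-validated model directly when a
% cross-validation option is specified.
CVMdl = fitrtree(X,MPG,'KFold',5);
numFolds = numel(CVMdl.Trained);
L = kfoldLoss(CVMdl);
```

This route skips the tree trained on the full data set, so use crossval instead when you also need that tree for prediction.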
