Estimate loss using cross-validation
err = crossval(criterion,X,y,'Predfun',predfun) returns a 10-fold cross-validation error estimate for the function predfun based on the specified criterion, either 'mse' (mean squared error) or 'mcr' (misclassification rate). The rows of X and y correspond to observations, and the columns of X correspond to predictor variables.
In this case, crossval performs 10-fold cross-validation as follows:
1. Split the observations in the predictor data X and the response variable y into 10 groups, each of which has approximately the same number of observations.
2. Use the last nine groups of observations to train a model as specified in predfun. Use the first group of observations as test data, pass the test predictor data to the trained model, and compute predicted values as specified in predfun. Compute the error specified by criterion.
3. Use the first group and the last eight groups of observations to train a model as specified in predfun. Use the second group of observations as test data, pass the test predictor data to the trained model, and compute predicted values as specified in predfun. Compute the error specified by criterion.
4. Proceed in a similar manner until each group of observations is used as test data exactly once.
5. Return the mean error estimate as the scalar err.
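The steps above can be sketched by hand with cvpartition and compared against a direct crossval call. This is a minimal sketch under stated assumptions, not the function's internal implementation. The synthetic data, the least-squares predfun, and the names addOnes, foldErr, and errManual are illustrative. cvpartition assigns observations to groups at random rather than in order, and averaging the per-fold errors is one reading of "mean error estimate".

```matlab
rng(1);                                  % fix the seed so the random folds are reproducible
n = 100;
X = randn(n,2);                          % synthetic predictors (assumption)
y = 3 + X*[2; -1] + 0.5*randn(n,1);      % synthetic response (assumption)

% predfun trains on (XTRAIN,ytrain) and returns predictions for XTEST.
% Here: ordinary least squares with an intercept column.
addOnes = @(A) [ones(size(A,1),1) A];
predfun = @(XTRAIN,ytrain,XTEST) addOnes(XTEST) * (addOnes(XTRAIN) \ ytrain);

% Direct call: 10-fold cross-validated mean squared error.
err = crossval('mse',X,y,'Predfun',predfun);

% Manual version of the same procedure.
c = cvpartition(n,'KFold',10);           % split the observations into 10 groups
foldErr = zeros(c.NumTestSets,1);
for k = 1:c.NumTestSets
    trainIdx = training(c,k);            % logical index of the nine training groups
    testIdx  = test(c,k);                % logical index of the held-out group
    yfit = predfun(X(trainIdx,:),y(trainIdx),X(testIdx,:));
    foldErr(k) = mean((y(testIdx) - yfit).^2);
end
errManual = mean(foldErr);               % mean error over the 10 test groups
```

Because err and errManual come from different random partitions, expect them to be close but not identical.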
values = crossval(fun,X) performs 10-fold cross-validation for the function fun, applied to the data in X. The rows of X correspond to observations, and the columns of X correspond to variables.
crossval typically performs 10-fold cross-validation as follows:
1. Split the data in X into 10 groups, each of which has approximately the same number of observations.
2. Use the last nine groups of data to train a model as specified in fun. Use the first group of data as a test set, pass the test set to the trained model, and compute some value (for example, loss) as specified in fun.
3. Use the first group and the last eight groups of data to train a model as specified in fun. Use the second group of data as a test set, pass the test set to the trained model, and compute some value as specified in fun.
4. Proceed in a similar manner until each group of data is used as a test set exactly once.
5. Return the 10 computed values as the vector values.
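As an illustration of this syntax, the sketch below passes a single data matrix whose last column holds the response, and a fun that fits on the training rows and returns the test mean squared error. The data layout, the synthetic data, and the helper names addOnes and fitCoef are assumptions made for this example only.

```matlab
rng(1);                                  % reproducible random folds
n = 100;
P = randn(n,2);                          % synthetic predictors (assumption)
y = 3 + P*[2; -1] + 0.5*randn(n,1);      % synthetic response (assumption)
data = [P y];                            % last column holds the response

% fun receives the training rows and the test rows of data
% and returns one scalar per fold (here, the test MSE).
addOnes = @(A) [ones(size(A,1),1) A];
fitCoef = @(TRAIN) addOnes(TRAIN(:,1:end-1)) \ TRAIN(:,end);
fun = @(TRAIN,TEST) mean((TEST(:,end) - addOnes(TEST(:,1:end-1))*fitCoef(TRAIN)).^2);

values = crossval(fun,data);             % one value per test group
meanLoss = mean(values);                 % summarize the 10 values if a single number is needed
```

Unlike the 'Predfun' syntax, this form leaves the choice of what to compute, and how to summarize it, to fun and the caller.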
___ = crossval(___,Name,Value) specifies cross-validation options using one or more name-value pair arguments in addition to any of the input argument combinations and output arguments in previous syntaxes. For example, 'KFold',5 specifies to perform 5-fold cross-validation.
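A short sketch of the 'KFold' option follows. The data and the toy fun (squared gap between the training mean and the test mean) are hypothetical and chosen only to show how the number of returned values tracks the number of folds.

```matlab
rng(1);                                  % reproducible random folds
X = randn(60,1);                         % synthetic data (assumption)

% Hypothetical fun: squared difference between training and test means.
fun = @(XTRAIN,XTEST) (mean(XTRAIN) - mean(XTEST))^2;

values5 = crossval(fun,X,'KFold',5);     % 5-fold: one value per fold
```

With 'KFold',5 each test group holds about one fifth of the observations, so values5 has five elements instead of the default ten.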
A good practice is to use stratification (see Stratify) when you use cross-validation with classification algorithms. Otherwise, some test sets might not include observations for all classes.
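A minimal sketch of stratified cross-validation, assuming the fisheriris sample data set that ships with Statistics and Machine Learning Toolbox and a discriminant-analysis predfun built on classify. The 'mcr' criterion reports the misclassification rate.

```matlab
load fisheriris                          % meas: 150-by-4 predictors, species: class labels
rng(1);                                  % reproducible random folds

% predfun trains a linear discriminant on the training data
% and returns predicted labels for the test predictors.
predfun = @(XTRAIN,ytrain,XTEST) classify(XTEST,XTRAIN,ytrain);

% Stratify on the class labels so that each fold has roughly
% the same class proportions as the full data set.
err = crossval('mcr',meas,species,'Predfun',predfun,'Stratify',species);
```

Stratification matters most when classes are imbalanced. The iris classes are balanced, so the data set serves here only to show the call.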
Many classification and regression functions allow you to perform cross-validation directly. When you use fit functions such as fitcsvm, fitctree, and fitrtree, you can specify cross-validation options by using name-value pair arguments. Alternatively, you can first create models with these fit functions and then create a partitioned object by using the crossval object function. Use the kfoldLoss and kfoldPredict object functions to compute the loss and predicted values for the partitioned object. For more information, see ClassificationPartitionedModel and RegressionPartitionedModel.
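Both routes can be sketched with fitctree, again assuming the fisheriris sample data. The variable names are illustrative.

```matlab
load fisheriris                          % meas: predictors, species: class labels
rng(1);                                  % reproducible random folds

% Route 1: request cross-validation while fitting.
cvTree1 = fitctree(meas,species,'KFold',5);   % returns a partitioned model
loss1 = kfoldLoss(cvTree1);                   % cross-validated classification loss

% Route 2: fit a full model first, then partition it.
tree = fitctree(meas,species);
cvTree2 = crossval(tree,'KFold',5);           % crossval object function
loss2 = kfoldLoss(cvTree2);
labels = kfoldPredict(cvTree2);               % out-of-fold prediction for each observation
```

The two losses generally differ slightly because each route draws its own random partition.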
You can also specify cross-validation options when you perform lasso or elastic net regularization using lasso and lassoglm.
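For example, lasso accepts a 'CV' name-value argument. The sketch below uses synthetic data (an assumption) in which only the first and third predictors affect the response, and picks the coefficients at the regularization strength with the smallest cross-validated error.

```matlab
rng(1);                                  % reproducible data and folds
X = randn(100,5);                        % synthetic predictors (assumption)
y = 2*X(:,1) - X(:,3) + 0.5*randn(100,1);% only predictors 1 and 3 matter

[B,FitInfo] = lasso(X,y,'CV',10);        % 10-fold cross-validation over the Lambda sequence
idx = FitInfo.IndexMinMSE;               % index of the Lambda with minimum cross-validated MSE
coef = B(:,idx);                         % coefficients at that Lambda
```

FitInfo.Index1SE gives a sparser alternative: the largest Lambda whose error is within one standard error of the minimum.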
classify | confusionmat | cvpartition | kmeans | pca | regress