Optimization, in its most general form, is the process of locating a point that minimizes a real-valued function called the objective function. Bayesian optimization is the name of one such process. Bayesian optimization internally maintains a Gaussian process model of the objective function, and uses objective function evaluations to train the model. One innovation in Bayesian optimization is the use of an acquisition function, which the algorithm uses to determine the next point to evaluate. The acquisition function balances sampling at points where the modeled objective function is low against exploring areas that are not yet modeled well. For details, see Bayesian Optimization Algorithm.
Bayesian optimization is part of Statistics and Machine Learning Toolbox™ because it is well-suited to optimizing hyperparameters of classification and regression algorithms. A hyperparameter is an internal parameter of a classifier or regression function, such as the box constraint of a support vector machine, or the learning rate of a robust classification ensemble. These parameters can strongly affect the performance of a classifier or regressor, and yet it is typically difficult or time-consuming to optimize them. See Bayesian Optimization Characteristics.
Typically, optimizing the hyperparameters means that you try to minimize the cross-validation loss of a classifier or regression model.
You can perform a Bayesian optimization in several ways:
fitcauto and fitrauto — Pass predictor and response data to the fitcauto or fitrauto function to optimize across a selection of model types and hyperparameter values. Unlike other approaches, using fitcauto or fitrauto does not require you to specify a single model before the optimization; model selection is part of the optimization process. The optimization minimizes cross-validation loss, which is modeled using a multi-TreeBagger model in fitcauto and a multi-RegressionGP model in fitrauto, rather than a single Gaussian process regression model as used in other approaches. (A minimal fitcauto call is sketched after this list.) See Bayesian Optimization for fitcauto and Bayesian Optimization for fitrauto.
Classification Learner and Regression Learner apps — Choose Optimizable models in the machine learning apps and automatically tune their hyperparameter values by using Bayesian optimization. The optimization minimizes the model loss based on the selected validation options. This approach has fewer tuning options than using a fit function, but allows you to perform Bayesian optimization directly in the apps. See Hyperparameter Optimization in Classification Learner App and Hyperparameter Optimization in Regression Learner App.
Fit function — Include the OptimizeHyperparameters name-value pair in many fitting functions to apply Bayesian optimization automatically. The optimization minimizes cross-validation loss. This approach gives you fewer tuning options than using bayesopt, but enables you to perform Bayesian optimization more easily. See Bayesian Optimization Using a Fit Function.
bayesopt — Exert the most control over your optimization by calling bayesopt directly. This approach requires you to write an objective function, which does not have to represent cross-validation loss. See Bayesian Optimization Using bayesopt.
Bayesian Optimization Using a Fit Function

To minimize the error in a cross-validated response via Bayesian optimization, follow these steps.
Choose your classification or regression solver among fitcdiscr, fitcecoc, fitcensemble, fitckernel, fitcknn, fitclinear, fitcnb, fitcsvm, fitctree, fitrensemble, fitrgp, fitrkernel, fitrlinear, fitrsvm, or fitrtree.
Decide on the hyperparameters to optimize, and pass them in the OptimizeHyperparameters name-value pair. For each fit function, you can choose from a set of hyperparameters. See Eligible Hyperparameters for Fit Functions, or use the hyperparameters function, or consult the fit function reference page.

You can pass a cell array of parameter names. You can also set 'auto' as the OptimizeHyperparameters value, which chooses a typical set of hyperparameters to optimize, or 'all' to optimize all available parameters.

For ensemble fit functions fitcecoc, fitcensemble, and fitrensemble, also include parameters of the weak learners in the OptimizeHyperparameters cell array.
Optionally, create an options structure for the HyperparameterOptimizationOptions name-value pair. See Hyperparameter Optimization Options for Fit Functions.
Call the fit function with the appropriate name-value pairs.
For examples, see Optimize an SVM Classifier Fit Using Bayesian Optimization and Optimize a Boosted Regression Ensemble. Also, every fit function reference page contains a Bayesian optimization example.
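As a compact illustration of these steps, the following sketch optimizes an SVM classifier on the ionosphere sample data. The data set, solver, and option values here are illustrative only.

```matlab
% Sketch of the fit-function workflow: minimize cross-validation loss of an
% SVM classifier by Bayesian optimization over a typical set of hyperparameters.
load ionosphere                                   % predictor matrix X, labels Y
rng default                                       % for reproducibility
Mdl = fitcsvm(X,Y, ...
    'OptimizeHyperparameters','auto', ...         % e.g., BoxConstraint and KernelScale
    'HyperparameterOptimizationOptions', ...
    struct('MaxObjectiveEvaluations',30,'ShowPlots',false));
```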
Bayesian Optimization Using bayesopt
To perform a Bayesian optimization using bayesopt, follow these steps.
Prepare your variables. See Variables for a Bayesian Optimization.
Create your objective function. See Bayesian Optimization Objective Functions. If necessary, create constraints, too. See Constraints in Bayesian Optimization. To include extra parameters in an objective function, see Parameterizing Functions.
Decide on options, meaning the bayesopt Name,Value pairs. You are not required to pass any options to bayesopt, but you typically do, especially when trying to improve a solution.
Call bayesopt.
Examine the solution. You can decide to resume the optimization by using resume, or restart the optimization, usually with modified options.
For an example, see Optimize a Cross-Validated SVM Classifier Using bayesopt.
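The following sketch walks through those steps for a cross-validated SVM classifier on the ionosphere sample data. The variable ranges and options are illustrative assumptions, not recommendations.

```matlab
% Sketch of the bayesopt workflow: prepare variables, write an objective
% function that returns cross-validation loss, and call bayesopt.
load ionosphere                                        % predictor matrix X, labels Y
rng default
c = cvpartition(Y,'KFold',5);                          % fixed partition for comparable evaluations

% Variables to optimize (log-scaled ranges are an illustrative choice)
box   = optimizableVariable('box',[1e-3,1e3],'Transform','log');
sigma = optimizableVariable('sigma',[1e-3,1e3],'Transform','log');

% Objective function: cross-validation loss of an RBF SVM at the point z
minfn = @(z) kfoldLoss(fitcsvm(X,Y,'CVPartition',c, ...
    'KernelFunction','rbf','BoxConstraint',z.box,'KernelScale',z.sigma));

% Options, call, and examination of the result
results = bayesopt(minfn,[box,sigma], ...
    'AcquisitionFunctionName','expected-improvement-plus','Verbose',0);
bestPoint(results)                                     % best hyperparameters found
```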
Bayesian optimization algorithms are best suited to these problem types.
Characteristic | Details |
---|---|
Low dimension | Bayesian optimization works best in a low number of dimensions, typically 10 or fewer. While Bayesian optimization can solve some problems with a few dozen variables, it is not recommended for dimensions higher than about 50. |
Expensive objective | Bayesian optimization is designed for objective functions that are slow to evaluate. It has considerable overhead, typically several seconds for each iteration. |
Low accuracy | Bayesian optimization does not necessarily give very accurate results. If you have a deterministic objective function, you can sometimes improve the accuracy by starting a standard optimization algorithm from the solution that Bayesian optimization returns. |
Global solution | Bayesian optimization is a global technique. Unlike many other algorithms, to search for a global solution you do not have to start the algorithm from various initial points. |
Hyperparameters | Bayesian optimization is well suited to optimizing hyperparameters of another function. A hyperparameter is a parameter that controls the behavior of a function. For example, the fitcsvm function fits an SVM model to data; its BoxConstraint and KernelScale hyperparameters control the resulting model. |
Hyperparameter Optimization Options for Fit Functions

When optimizing using a fit function, you have these options available in the HyperparameterOptimizationOptions name-value pair. Give the value as a structure. All fields in the structure are optional.
Field Name | Values | Default |
---|---|---|
Optimizer | 'bayesopt' (use Bayesian optimization; internally, this setting calls bayesopt), 'gridsearch' (use grid search with NumGridDivisions values per dimension), or 'randomsearch' (search at random among MaxObjectiveEvaluations points). | 'bayesopt' |
AcquisitionFunctionName | Name of the acquisition function, such as 'expected-improvement-plus', 'expected-improvement-per-second-plus', or 'lower-confidence-bound'. Acquisition functions whose names include per-second do not yield reproducible results because the optimization depends on the runtime of the objective function. | 'expected-improvement-per-second-plus' |
MaxObjectiveEvaluations | Maximum number of objective function evaluations. | 30 for 'bayesopt' or 'randomsearch', and the entire grid for 'gridsearch' |
MaxTime | Time limit, specified as a positive real. The time limit is in seconds, as measured by tic and toc. | Inf |
NumGridDivisions | For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables. | 10 |
ShowPlots | Logical value indicating whether to show plots. If true, this field plots the best objective function value against the iteration number. If there are one or two optimization parameters, and if Optimizer is 'bayesopt', then ShowPlots also plots a model of the objective function against the parameters. | true |
SaveIntermediateResults | Logical value indicating whether to save results when Optimizer is 'bayesopt'. If true, this field overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object. | false |
Verbose | Display to the command line: 0 (no iterative display), 1 (iterative display), or 2 (iterative display with extra information). For details, see the bayesopt Verbose name-value pair argument. | 1 |
UseParallel | Logical value indicating whether to run Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. For details, see Parallel Bayesian Optimization. | false |
Repartition | Logical value indicating whether to repartition the cross-validation at every iteration. If false, the optimizer uses a single partition for the optimization. Setting true usually gives the most robust results because it takes partitioning noise into account, but it requires more function evaluations for good results. | false |
Use no more than one of the following three field names. | | |
CVPartition | A cvpartition object, as created by cvpartition. | 'Kfold',5 if you do not specify any cross-validation field |
Holdout | A scalar in the range (0,1) representing the holdout fraction. | |
Kfold | An integer greater than 1. | |
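As one possible combination of these fields (the specific values are illustrative), build the structure and pass it to a fit function:

```matlab
% Sketch: assemble HyperparameterOptimizationOptions fields from the table
% above and pass the structure to a fit function.
load ionosphere                                   % predictor matrix X, labels Y
opts = struct('Optimizer','bayesopt', ...
    'AcquisitionFunctionName','expected-improvement-plus', ...
    'MaxObjectiveEvaluations',25, ...
    'Kfold',10, ...                               % 10-fold cross-validation loss
    'ShowPlots',false,'Verbose',1);
Mdl = fitctree(X,Y,'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions',opts);
```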
See Also

BayesianOptimization | bayesopt